direct estimate
Fully Bayesian Spectral Clustering and Benchmarking with Uncertainty Quantification for Small Area Estimation
In this work, inspired by machine learning techniques, we propose a new Bayesian model for Small Area Estimation (SAE), the Fay-Herriot model with Spectral Clustering (FH-SC). Unlike traditional approaches, clustering in FH-SC is based on spectral clustering algorithms that utilize external covariates, rather than geographical or administrative criteria. A major advantage of the FH-SC model is its flexibility in integrating existing SAE approaches, with or without clustering random effects. To enable benchmarking, we leverage the theoretical framework of posterior projections for constrained Bayesian inference and derive closed-form expressions for the new Rao-Blackwell (RB) estimators of the posterior mean under the FH-SC model. Additionally, we introduce a novel measure of uncertainty for the benchmarked estimator, the Conditional Posterior Mean Square Error (CPMSE), which is generalizable to other Bayesian SAE estimators. We conduct model-based and data-based simulation studies to evaluate the frequentist properties of the CPMSE. The proposed methodology is motivated by a real case study involving the estimation of the proportion of households with internet access in the municipalities of Colombia. Finally, we illustrate the advantages of FH-SC over existing Bayesian and frequentist approaches through our case study.
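For readers unfamiliar with the building blocks named in the abstract above, the following is a minimal sketch of the standard (non-clustered) Fay-Herriot shrinkage estimator followed by a simple benchmarking step. It is not the FH-SC model or the Rao-Blackwell estimators derived in the paper: the spectral clustering step is omitted, the model variance and synthetic (regression) predictions are assumed to be known, and the benchmarking shown is a plain Euclidean projection onto a single linear constraint. All function names and numbers are hypothetical.

```python
def fay_herriot_shrinkage(direct, sampling_var, synthetic, model_var):
    """Textbook Fay-Herriot posterior mean with known variances (an assumption).

    Each area estimate is a weighted average of its direct estimate and a
    synthetic regression prediction, with weight gamma = A / (A + D_i).
    """
    shrunk = []
    for y, d, s in zip(direct, sampling_var, synthetic):
        gamma = model_var / (model_var + d)  # more weight on y when D_i is small
        shrunk.append(gamma * y + (1.0 - gamma) * s)
    return shrunk


def benchmark_by_projection(estimates, weights, target):
    """Project estimates onto the constraint sum(w_i * theta_i) == target.

    This is the Euclidean projection onto one linear constraint; it is an
    illustrative stand-in, not the paper's posterior-projection estimator.
    """
    gap = target - sum(w * e for w, e in zip(weights, estimates))
    norm_sq = sum(w * w for w in weights)
    return [e + w * gap / norm_sq for w, e in zip(weights, estimates)]


# Hypothetical inputs for two areas.
direct = [0.2, 0.6]          # direct survey estimates
sampling_var = [0.01, 0.03]  # known sampling variances D_i
synthetic = [0.3, 0.5]       # regression predictions x_i' beta
model_var = 0.01             # random-effect variance A

shrunk = fay_herriot_shrinkage(direct, sampling_var, synthetic, model_var)
weights = [0.5, 0.5]         # population shares of the two areas
benchmarked = benchmark_by_projection(shrunk, weights, target=0.4)
```

After the projection, the weighted sum of the benchmarked estimates matches the target aggregate, which is the defining property of a benchmarked estimator.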
- North America > United States > California > Yolo County > Davis (0.40)
- Africa > Sub-Saharan Africa (0.14)
- South America > Colombia > La Guajira Department > Riohacha (0.04)
- (7 more...)
- Research Report (0.64)
- Workflow (0.46)
- Health & Medicine (0.67)
- Government (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Evaluation of A Spatial Microsimulation Framework for Small-Area Estimation of Population Health Outcomes Using the Behavioral Risk Factor Surveillance System
Von Hoene, Emma, Gupta, Aanya, Kavak, Hamdi, Roess, Amira, Anderson, Taylor
The field of population health addresses a wide spectrum of challenges, spanning infectious and chronic diseases to mental health and health risk behaviors such as smoking and alcohol consumption (Sharma et al., 2025). A common barrier to addressing these issues is the lack of ground truth data capturing health outcomes and behaviors at fine geographic scales. This limits both local and national health decision-makers in planning and management efforts, such as identifying health inequalities or targeting interventions where they are most needed (Rahman, 2017; Wang, 2018). To fill this gap, researchers use small area estimation (SAE), a collection of statistical methods that combine survey and geographic data to generate estimates of population-level health outcomes at various spatial scales (RTI International, 2025). There are numerous methods for generating SAE of health outcomes, which can generally be grouped into two main approaches: direct and indirect model-based estimates (Rahman, 2017). Direct estimates are calculated using only the survey responses from individuals or households sampled within the specified geographic areas (counties, states) to estimate disease prevalence or other population characteristics.
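The notion of a direct estimate described above can be illustrated with a short sketch. It assumes each survey record carries an area label, a survey weight, and a binary outcome, and it uses a weighted ratio of the respondents within each area; the record layout, the use of weights, and the sample values are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict


def direct_estimates(records):
    """Weighted prevalence per area, using only that area's own respondents.

    records: iterable of (area, survey_weight, outcome) with outcome in {0, 1}.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for area, weight, outcome in records:
        weighted_sum[area] += weight * outcome
        weight_total[area] += weight
    # No information is borrowed from other areas or from auxiliary covariates.
    return {area: weighted_sum[area] / weight_total[area] for area in weight_total}


# Hypothetical survey records: (area, survey weight, has_condition).
survey = [
    ("county_a", 100.0, 1),
    ("county_a", 300.0, 0),
    ("county_b", 50.0, 1),
    ("county_b", 50.0, 1),
    ("county_b", 100.0, 0),
]
estimates = direct_estimates(survey)
```

Because each estimate depends only on the respondents sampled in its own area, areas with few respondents yield unstable estimates, which is the usual motivation for the indirect, model-based alternatives.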
- North America > United States > New York (0.05)
- North America > United States > Virginia > Fairfax County > Fairfax (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (11 more...)
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.67)
Adaptive Sequential Machine Learning
Wilson, Craig, Bu, Yuheng, Veeravalli, Venugopal
A framework previously introduced in [3] for solving a sequence of stochastic optimization problems with bounded changes in the minimizers is extended and applied to machine learning problems such as regression and classification. The stochastic optimization problems arising in these machine learning problems are solved using algorithms such as stochastic gradient descent (SGD). A method based on estimates of the change in the minimizers and properties of the optimization algorithm is introduced for adaptively selecting the number of samples at each time step to ensure that the excess risk, i.e., the expected gap between the loss achieved by the approximate minimizer produced by the optimization algorithm and the exact minimizer, does not exceed a target level. A bound is developed to show that the estimate of the change in the minimizers is non-trivial provided that the excess risk is small enough. Extensions relevant to the machine learning setting are considered, including a cost-based approach to select the number of samples with a cost budget over a fixed horizon, and an approach to applying cross-validation for model selection. Finally, experiments with synthetic and real data are used to validate the algorithms.
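The sample-size selection idea in the abstract above can be sketched generically: given some upper bound on the excess risk that decreases as more samples are used, pick the smallest number of samples for which the bound meets the target. The bound used below is a hypothetical placeholder of the form (change^2 + noise) / n, chosen only so the example runs; it is not the bound derived in the paper, and the function names are likewise assumptions.

```python
def smallest_sample_size(bound, target, max_samples=1_000_000):
    """Smallest n in [1, max_samples] with bound(n) <= target.

    Assumes bound(n) is non-increasing in n, so binary search is valid.
    """
    if bound(max_samples) > target:
        raise ValueError("target not reachable within max_samples")
    lo, hi = 1, max_samples
    while lo < hi:
        mid = (lo + hi) // 2
        if bound(mid) <= target:
            hi = mid
        else:
            lo = mid + 1
    return lo


def toy_excess_risk_bound(n, change_estimate, noise=1.0):
    """Hypothetical bound: grows with the estimated change in the minimizer
    and shrinks like 1/n. A placeholder, not the paper's result."""
    return (change_estimate ** 2 + noise) / n


target = 0.02
# A small estimated change in the minimizer needs fewer samples ...
n_small = smallest_sample_size(lambda n: toy_excess_risk_bound(n, 0.5), target)
# ... than a large estimated change, for the same target excess risk.
n_large = smallest_sample_size(lambda n: toy_excess_risk_bound(n, 2.0), target)
```

Passing the bound in as a callable keeps the search independent of any particular optimization algorithm, so a bound specific to SGD or another method could be substituted without changing the selection logic.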
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > Illinois (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- (6 more...)
- Education > Educational Setting > Online (0.46)
- Education > Focused Education > Special Education (0.45)