AITopics

2512.15987

Country: North America > United States (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)
(2 more...)

Fully Bayesian Spectral Clustering and Benchmarking with Uncertainty Quantification for Small Area Estimation

Fúquene-Patiño, Jairo

In this work, inspired by machine learning techniques, we propose a new Bayesian model for Small Area Estimation (SAE), the Fay-Herriot model with Spectral Clustering (FH-SC). Unlike traditional approaches, clustering in FH-SC is based on spectral clustering algorithms that utilize external covariates, rather than geographical or administrative criteria. A major advantage of the FH-SC model is its flexibility in integrating existing SAE approaches, with or without clustering random effects. To enable benchmarking, we leverage the theoretical framework of posterior projections for constrained Bayesian inference and derive closed form expressions for the new Rao-Blackwell (RB) estimators of the posterior mean under the FH-SC model. Additionally, we introduce a novel measure of uncertainty for the benchmarked estimator, the Conditional Posterior Mean Square Error (CPMSE), which is generalizable to other Bayesian SAE estimators. We conduct model-based and data-based simulation studies to evaluate the frequentist properties of the CPMSE. The proposed methodology is motivated by a real case study involving the estimation of the proportion of households with internet access in the municipalities of Colombia. Finally, we also illustrate the advantages of FH-SC over existing Bayesian and frequentist approaches through our case study.

estimator, fh-sc, fh-sc model, (17 more...)

2512.15643

Country:

Africa > Sub-Saharan Africa (0.14)
South America > Colombia > La Guajira Department > Riohacha (0.04)
Asia > Middle East > Jordan (0.04)
(7 more...)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry:

Health & Medicine (0.67)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Blondel, Mathieu, Sander, Michael E., Vivier-Ardisson, Germain, Liu, Tianlin, Roulet, Vincent

Autoregressive Language Models are Secretly Energy-Based Models: Insights into the Lookahead Capabilities of Next-Token Prediction

Autoregressive models (ARMs) currently constitute the dominant paradigm for large language models (LLMs). Energy-based models (EBMs) represent another class of models, which have historically been less prevalent in LLM development, yet naturally characterize the optimal policy in post-training alignment. In this paper, we provide a unified view of these two model classes. Taking the chain rule of probability as a starting point, we establish an explicit bijection between ARMs and EBMs in function space, which we show to correspond to a special case of the soft Bellman equation in maximum entropy reinforcement learning. Building upon this bijection, we derive the equivalence between supervised learning of ARMs and EBMs. Furthermore, we analyze the distillation of EBMs into ARMs by providing theoretical error bounds. Our results provide insights into the ability of ARMs to plan ahead, despite being based on the next-token prediction paradigm.

autoregressive language model, ebm, probability, (14 more...)

2512.15605

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Catalina, Daniel Sánchez, Cantwell, George T.

Model inference for ranking from pairwise comparisons

We consider the problem of ranking objects from noisy pairwise comparisons, for example, ranking tennis players from the outcomes of matches. We follow a standard approach to this problem and assume that each object has an unobserved strength and that the outcome of each comparison depends probabilistically on the strengths of the comparands. However, we do not assume to know a priori how skills affect outcomes. Instead, we present an efficient algorithm for simultaneously inferring both the unobserved strengths and the function that maps strengths to probabilities. Despite this problem being under-constrained, we present experimental evidence that the conclusions of our Bayesian approach are robust to different model specifications. We include several case studies to exemplify the method on real-world data sets.

algorithm, bradley-terry model, kernel, (17 more...)

2512.15269

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Sports > Tennis (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

A Bayesian latent class reinforcement learning framework to capture adaptive, feedback-driven travel behaviour

Sfeir, Georges, Hess, Stephane, Hancock, Thomas O., Rodrigues, Filipe, Rad, Jamal Amani, Bliemer, Michiel, Beck, Matthew, Khan, Fayyaz

Many travel decisions involve a degree of experience formation, where individuals learn their preferences over time. At the same time, there is extensive scope for heterogeneity across individual travellers, both in their underlying preferences and in how these evolve. The present paper puts forward a Latent Class Reinforcement Learning (LCRL) model that allows analysts to capture both of these phenomena. We apply the model to a driving simulator dataset and estimate the parameters through Variational Bayes. We identify three distinct classes of individuals that differ markedly in how they adapt their preferences: the first displays context-dependent preferences with context-specific exploitative tendencies; the second follows a persistent exploitative strategy regardless of context; and the third engages in an exploratory strategy combined with context-specific preferences.

participant, probability, unreliable route, (14 more...)

2512.14713

Country:

Asia > Middle East > Jordan (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Iowa (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Lewis, Rebecca M., Feng, Oliver Y., Reeve, Henry W. J., Xu, Min, Samworth, Richard J.

Learning the score under shape constraints

Score estimation has recently emerged as a key modern statistical challenge, due to its pivotal role in generative modelling via diffusion models. Moreover, it is an essential ingredient in a new approach to linear regression via convex $M$-estimation, where the corresponding error densities are projected onto the log-concave class. Motivated by these applications, we study the minimax risk of score estimation with respect to squared $L^2(P_0)$-loss, where $P_0$ denotes an underlying log-concave distribution on $\mathbb{R}$. Such distributions have decreasing score functions, but on its own, this shape constraint is insufficient to guarantee a finite minimax risk. We therefore define subclasses of log-concave densities that capture two fundamental aspects of the estimation problem. First, we establish the crucial impact of tail behaviour on score estimation by determining the minimax rate over a class of log-concave densities whose score function exhibits controlled growth relative to the quantile levels. Second, we explore the interplay between smoothness and log-concavity by considering the class of log-concave densities with a scale restriction and a $(β,L)$-Hölder assumption on the log-density for some $β\in [1,2]$. We show that the minimax risk over this latter class is of order $L^{2/(2β+1)}n^{-β/(2β+1)}$ up to poly-logarithmic factors, where $n$ denotes the sample size. When $β< 2$, this rate is faster than could be obtained under either the shape constraint or the smoothness assumption alone. Our upper bounds are attained by a locally adaptive, multiscale estimator constructed from a uniform confidence band for the score function. This study highlights intriguing differences between the score estimation and density estimation problems over this shape-constrained class.

estimation, score estimation, score function, (17 more...)

2512.14624

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > New Zealand (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Pfeifer, Dániel, Kovács, Edith Alice

Trunc-Opt vine building algorithms

Vine copula models have become highly popular and practical tools for modelling multivariate probability distributions due to their flexibility in modelling different kinds of dependences between the random variables involved. However, their flexibility comes with the drawback of a high-dimensional parameter space. To tackle this problem, truncated vine copulas were introduced by Kurowicka (2010) (Gaussian case) and Brechmann and Czado (2013) (general case). Truncated vine copulas contain conditionally independent pair copulas after the truncation level. So far, in the general case, truncated vine constructing algorithms started from the lowest tree in order to encode the largest dependences in the lower trees. The novelty of this paper starts from the observation that a truncated vine is determined by the first tree after the truncation level (see Kovács and Szántai (2017)). This paper introduces a new score for fitting truncated vines to given data, called the Weight of the truncated vine. Then we propose a completely new methodology for constructing truncated vines. We prove theorems which motivate this new approach. While earlier algorithms did not use conditional independences, we give algorithms for constructing and encoding truncated vines which do exploit them. Finally, we illustrate the algorithms on real datasets and compare the results with well-known methods included in R packages. Our method generally compare favorably to previously known methods.

algorithm, cherry tree, copula, (14 more...)

2512.14399

Country:

Europe > Hungary > Budapest > Budapest (0.04)
North America > United States > Wisconsin (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Kucharský, Šimon, Mishra, Aayush, Habermann, Daniel, Radev, Stefan T., Bürkner, Paul-Christian

Improving the Accuracy of Amortized Model Comparison with Self-Consistency

Amortized Bayesian inference (ABI) offers fast, scalable approximations to posterior densities by training neural surrogates on data simulated from the statistical model. However, ABI methods are highly sensitive to model misspecification: when observed data fall outside the training distribution (generative scope of the statistical models), neural surrogates can behave unpredictably. This makes it a challenge in a model comparison setting, where multiple statistical models are considered, of which at least some are misspecified. Recent work on self-consistency (SC) provides a promising remedy to this issue, accessible even for empirical data (without ground-truth labels). In this work, we investigate how SC can improve amortized model comparison conceptualized in four different ways. Across two synthetic and two real-world case studies, we find that approaches for model comparison that estimate marginal likelihoods through approximate parameter posteriors consistently outperform methods that directly approximate model evidence or posterior model probabilities. SC training improves robustness when the likelihood is available, even under severe model misspecification. The benefits of SC for methods without access of analytic likelihoods are more limited and inconsistent. Our results suggest practical guidance for reliable amortized Bayesian model comparison: prefer parameter posterior-based methods and augment them with SC training on empirical datasets to mitigate extrapolation bias under model misspecification.

likelihood, marginal likelihood, sc training, (13 more...)

2512.14308

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Transportation (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Buckley, Brian, O'Hagan, Adrian, Galligan, Marie

A variational Bayes latent class approach for EHR-based patient phenotyping in R

As regulatory agencies increasingly recognise real-world evidence as a complement to traditional clinical trial data, interest has grown in applying Bayesian methods across both interventional and observational research (Boulanger and Carlin (2021). A central objective in many clinical investigations is the delineation of patient subgroups that exhibit comparable disease-related characteristics (He, Belouali, Patricoski, Lehmann, Ball, Anagnostou, Kreimeyer, and Botsis (2023)). Electronic Health Records (EHR) have become an important resource for such phenotypic analyses (Hripcsak and Albers (2013)). Bayesian approaches to patient phenotyping in clinical observational studies have been limited by the computational challenges associated with applying the Markov Chain Monte Carlo (MCMC) approach to real-world data. Hubbard, Huang, Harton, Oganisian, Choi, Utidjian, Eneli, Bailey, and Chen (2019) proposed a Bayes latent class model that could be used in a general context for observational studies that use EHR data. They consider the common clinical context where gold-standard phenotype information, such as genetic and laboratory data, is not fully available. A general model of this form has high potential applicability for use in clinical decision support across disease areas for both primary and secondary clinical databases. Latent Class Analysis (LCA) is widely used when we want to identify patient phenotypes or subgroups given multivariate data (Lanza and Rhoades (2013)). A challenge in clinical LCA is the prevalence of mixed data, where we may have combinations of continuous, nominal, ordinal and count data.

ehr-based patient, phenotype, variational bayes latent class approach, (9 more...)

2512.14272

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.89)
Health & Medicine > Health Care Technology > Medical Record (0.56)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)

Tanweer, Sunia, Khasawneh, Firas A.

Unsupervised Learning of Density Estimates with Topological Optimization

Kernel density estimation is a key component of a wide variety of algorithms in machine learning, Bayesian inference, stochastic dynamics and signal processing. However, the unsupervised density estimation technique requires tuning a crucial hyperparameter: the kernel bandwidth. The choice of bandwidth is critical as it controls the bias-variance trade-off by over- or under-smoothing the topological features. Topological data analysis provides methods to mathematically quantify topological characteristics, such as connected components, loops, voids et cetera, even in high dimensions where visualization of density estimates is impossible. In this paper, we propose an unsupervised learning approach using a topology-based loss function for the automated and unsupervised selection of the optimal bandwidth and benchmark it against classical techniques -- demonstrating its potential across different dimensions.

bandwidth, density estimation, dimension, (15 more...)

2512.08895

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Michigan > Ingham County > Lansing (0.04)
North America > United States > Michigan > Ingham County > East Lansing (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)