AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

Kucharský, Šimon, Mishra, Aayush, Habermann, Daniel, Radev, Stefan T., Bürkner, Paul-Christian

Improving the Accuracy of Amortized Model Comparison with Self-Consistency

arXiv.org Machine LearningDec-17-2025

Amortized Bayesian inference (ABI) offers fast, scalable approximations to posterior densities by training neural surrogates on data simulated from the statistical model. However, ABI methods are highly sensitive to model misspecification: when observed data fall outside the training distribution (generative scope of the statistical models), neural surrogates can behave unpredictably. This makes it a challenge in a model comparison setting, where multiple statistical models are considered, of which at least some are misspecified. Recent work on self-consistency (SC) provides a promising remedy to this issue, accessible even for empirical data (without ground-truth labels). In this work, we investigate how SC can improve amortized model comparison conceptualized in four different ways. Across two synthetic and two real-world case studies, we find that approaches for model comparison that estimate marginal likelihoods through approximate parameter posteriors consistently outperform methods that directly approximate model evidence or posterior model probabilities. SC training improves robustness when the likelihood is available, even under severe model misspecification. The benefits of SC for methods without access of analytic likelihoods are more limited and inconsistent. Our results suggest practical guidance for reliable amortized Bayesian model comparison: prefer parameter posterior-based methods and augment them with SC training on empirical datasets to mitigate extrapolation bias under model misspecification.

likelihood, marginal likelihood, sc training, (13 more...)

2512.14308

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Transportation (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Volkmann, Björn, Ewering, Jan-Hendrik, Meindl, Michael, Ehlers, Simon F. G., Seel, Thomas

Bayesian Inference and Learning in Nonlinear Dynamical Systems: A Framework for Incorporating Explicit and Implicit Prior Knowledge

arXiv.org Machine LearningAug-22-2025

Accuracy and generalization capabilities are key objectives when learning dynamical system models. To obtain such models from limited data, current works exploit prior knowledge and assumptions about the system. However, the fusion of diverse prior knowledge, e. g. partially known system equations and smoothness assumptions about unknown model parts, with information contained in the data remains a challenging problem, especially in input-output settings with latent system state. In particular, learning functions that are nested inside known system equations can be a laborious and error-prone expert task. This paper considers inference of latent states and learning of unknown model parts for fusion of data information with different sources of prior knowledge. The main contribution is a general-purpose system identification tool that, for the first time, provides a consistent solution for both, online and offline Bayesian inference and learning while allowing to incorporate explicit and implicit prior system knowledge. We propose a novel interface for combining known dynamics functions with a learning-based approximation of unknown system parts. Based on the proposed model structure, closed-form densities for efficient parameter marginalization are derived. No user-tailored coordinate transformations or model inversions are needed, making the presented framework a general-purpose tool for inference and learning. The broad applicability of the devised framework is illustrated in three distinct case studies, including an experimental data set.

artificial intelligence, knowledge, machine learning, (16 more...)

2508.15345

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Neural Information Processing SystemsMay-27-2025, 01:57:25 GMT

Should We Learn Most Likely Functions or Parameters?

Standard regularized training procedures correspond to maximizing a posterior distribution over parameters, known as maximum a posteriori (MAP) estimation. However, model parameters are of interest only insomuch as they combine with the functional form of a model to provide a function that can make good predictions. Moreover, the most likely parameters under the parameter posterior do not generally correspond to the most likely function induced by the parameter posterior. In fact, we can re-parametrize a model such that any setting of parameters can maximize the parameter posterior. As an alternative, we investigate the benefits and drawbacks of directly estimating the most likely function implied by the model and the data. We show that this procedure leads to pathological solutions when using neural networks and prove conditions under which the procedure is well-behaved, as well as a scalable approximation.

estimation, likely function, parameter posterior, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.63)

Neural Information Processing SystemsJan-19-2025, 05:31:57 GMT

Should We Learn Most Likely Functions or Parameters?

Standard regularized training procedures correspond to maximizing a posterior distribution over parameters, known as maximum a posteriori (MAP) estimation. However, model parameters are of interest only insomuch as they combine with the functional form of a model to provide a function that can make good predictions. Moreover, the most likely parameters under the parameter posterior do not generally correspond to the most likely function induced by the parameter posterior. In fact, we can re-parametrize a model such that any setting of parameters can maximize the parameter posterior. As an alternative, we investigate the benefits and drawbacks of directly estimating the most likely function implied by the model and the data. We show that this procedure leads to pathological solutions when using neural networks and prove conditions under which the procedure is well-behaved, as well as a scalable approximation.

estimation, likely function, parameter posterior, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.63)

Kucharský, Šimon, Bürkner, Paul Christian

Amortized Bayesian Mixture Models

arXiv.org Machine LearningJan-17-2025

Finite mixtures are a broad class of models useful in scenarios where observed data is generated by multiple distinct processes but without explicit information about the responsible process for each data point. Estimating Bayesian mixture models is computationally challenging due to issues such as high-dimensional posterior inference and label switching. Furthermore, traditional methods such as MCMC are applicable only if the likelihoods for each mixture component are analytically tractable. Amortized Bayesian Inference (ABI) is a simulation-based framework for estimating Bayesian models using generative neural networks. This allows the fitting of models without explicit likelihoods, and provides fast inference. ABI is therefore an attractive framework for estimating mixture models. This paper introduces a novel extension of ABI tailored to mixture models. We factorize the posterior into a distribution of the parameters and a distribution of (categorical) mixture indicators, which allows us to use a combination of generative neural networks for parameter inference, and classification networks for mixture membership identification. The proposed framework accommodates both independent and dependent mixture models, enabling filtering and smoothing. We validate and demonstrate our approach through synthetic and real-world datasets.

artificial intelligence, machine learning, mixture model, (18 more...)

2501.10229

Country:

North America > United States (0.46)
Europe > United Kingdom > England (0.46)

Genre: Research Report > Experimental Study (0.67)

Industry: Health & Medicine (0.46)

Neural Information Processing SystemsOct-8-2024, 04:33:15 GMT

Reviews: Probabilistic Models for Integration Error in the Assessment of Functional Cardiac Models

Summary The paper presents a method for assessing the uncertainty in the evaluation of an expectation over the output of a complex simulation model given uncertainty in the model parameters. Such simulation models take a long time to solve, given a set of parameters, so the task of averaging over the outputs of the simulation given uncertainty in the parameters is challenging. One cannot simply run the model so many times that error in the estimate of the integral is controlled. The authors approach the problem as an inference task. Given samples from the parameter posterior one must infer the posterior over the integral of interest.

functional cardiac model, model, probabilistic model, (10 more...)

Technology:

Information Technology > Modeling & Simulation (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.40)

Turishcheva, Polina, Ramapuram, Jason, Williamson, Sinead, Busbridge, Dan, Dhekane, Eeshan, Webb, Russ

Bootstrap Your Own Variance

arXiv.org Machine LearningDec-5-2023

Understanding model uncertainty is important for many applications. We propose Bootstrap Your Own Variance (BYOV), combining Bootstrap Your Own Latent (BYOL), a negative-free Self-Supervised Learning (SSL) algorithm, with Bayes by Backprop (BBB), a Bayesian method for estimating model posteriors. We find that the learned predictive std of BYOV vs. a supervised BBB model is well captured by a Gaussian distribution, providing preliminary evidence that the learned parameter posterior is useful for label free uncertainty estimation. BYOV improves upon the deterministic BYOL baseline (+2.83% test ECE, +1.03% test Brier) and presents better calibration and reliability when tested with various augmentations (eg: +2.4% test ECE, +1.2% test Brier for Salt & Pepper noise).

artificial intelligence, international conference, machine learning, (12 more...)

2312.03213

Country:

Europe > Germany > Lower Saxony > Gottingen (0.14)
North America > Dominican Republic > Samaná > Samaná (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Schröder, Cornelius, Macke, Jakob H.

Simultaneous identification of models and parameters of scientific simulators

arXiv.org Artificial IntelligenceMay-24-2023

Many scientific models are composed of multiple discrete components, and scien tists often make heuristic decisions about which components to include. Bayesian inference provides a mathematical framework for systematically selecting model components, but defining prior distributions over model components and developing associated inference schemes has been challenging. We approach this problem in an amortized simulation-based inference framework: We define implicit model priors over a fixed set of candidate components and train neural networks to infer joint probability distributions over both, model components and associated parameters from simulations. To represent distributions over model components, we introduce a conditional mixture of multivariate binary distributions in the Grassmann formalism. Our approach can be applied to any compositional stochastic simulator without requiring access to likelihood evaluations. We first illustrate our method on a simple time series model with redundant components and show that it can retrieve joint posterior distribution over a set of symbolic expressions and their parameters while accurately capturing redundancy with strongly correlated posteriors. We then apply our approach to drift-diffusion models, a commonly used model class in cognitive neuroscience. After validating the method on synthetic data, we show that our approach explains experimental data as well as previous methods, but that our fully probabilistic approach can help to discover multiple data-consistent model configurations, as well as reveal non-identifiable model components and parameters. Our method provides a powerful tool for data-driven scientific inquiry which will allow scientists to systematically identify essential model components and make uncertainty-informed modelling decisions.

artificial intelligence, machine learning, posterior, (16 more...)

arXiv.org Artificial Intelligence

2305.15174

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > California > Los Angeles County > Pasadena (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)