AITopics | true posterior

true posterior

Gaussian Mean Field Variational Inference can Overestimate Predictive Variance

Odgers, James, Riegler, Ben, Swaroop, Siddharth, Fortuin, Vincent

arXiv.org Machine Learning, Jun-25-2026

Mean Field Variational Inference (MFVI) is widely understood to underestimate posterior variance. By analysing conjugate Bayesian Linear Regression (BLR), we show that this characterization is incomplete: while MFVI underestimates the variance in parameter space, it can overestimate the predictive variance compared to the exact posterior. We show that if the MFVI posterior underestimates predictive variances in some directions, it necessarily overestimates them in others. Crucially, this overestimation occurs in directions where the training data concentrates. This leads to the surprising result that, for a test point drawn from the training distribution, MFVI's expected predictive variance exceeds that of the exact posterior. We demonstrate a pathological case of this effect, where the MFVI posterior fails to reduce predictive variance compared to the prior on in-distribution data. We connect these results to the Cold Posterior Effect, arguing that varying the temperature can correct this overestimation, yielding predictions closer to those of the exact posterior. We validate our theory on synthetic and real-world regression tasks.
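The effect described in this abstract can be reproduced in a few lines. The following is a minimal sketch and not the authors' code. It assumes an isotropic Gaussian prior with precision `alpha`, a known noise precision `beta`, and the standard result that the factorized Gaussian minimizing the reverse KL divergence to a Gaussian with precision matrix Λ keeps the exact mean and has variances 1/Λ_ii. The function names, the synthetic data, and the parameter values are all illustrative.

```python
import numpy as np

def blr_posteriors(X, alpha=1.0, beta=4.0):
    """Exact and mean-field posterior covariances for conjugate BLR.

    Assumes prior w ~ N(0, I / alpha) and Gaussian noise with precision beta
    (both values are illustrative, not taken from the paper).
    """
    d = X.shape[1]
    precision = alpha * np.eye(d) + beta * X.T @ X   # exact posterior precision
    exact_cov = np.linalg.inv(precision)
    # Reverse-KL optimal factorized Gaussian: variance_i = 1 / precision_ii
    mfvi_cov = np.diag(1.0 / np.diag(precision))
    return exact_cov, mfvi_cov

def mean_predictive_variance(X, cov):
    """Average of x^T cov x over the rows of X (noise variance excluded)."""
    return float(np.mean(np.einsum("ni,ij,nj->n", X, cov, X)))

rng = np.random.default_rng(0)
# Correlated features, so the exact posterior is far from factorized.
mixing = np.array([[1.0, 0.9], [0.0, 0.3]])
X_train = rng.normal(size=(200, 2)) @ mixing

exact_cov, mfvi_cov = blr_posteriors(X_train)
exact_pv = mean_predictive_variance(X_train, exact_cov)
mfvi_pv = mean_predictive_variance(X_train, mfvi_cov)

print("parameter variances (exact):", np.diag(exact_cov))
print("parameter variances (MFVI): ", np.diag(mfvi_cov))
print("mean predictive variance on training inputs (exact):", exact_pv)
print("mean predictive variance on training inputs (MFVI): ", mfvi_pv)
```

Under these assumptions, each MFVI parameter variance is no larger than the exact one, while the predictive variance averaged over the training inputs is no smaller, which matches the abstract's claim for in-distribution points.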

artificial intelligence, machine learning, posterior, (18 more...)

arXiv.org Machine Learning

2606.25745

Country:

Asia (0.28)
Europe > Germany (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

CoLT: The conditional localization test for assessing the accuracy of neural posterior estimates

Neural Information Processing Systems, Jun-14-2026, 05:37:38 GMT

We consider the problem of validating whether a neural posterior estimate $q(\theta \mid x)$ is an accurate approximation to the true, unknown posterior $p(\theta \mid x)$. Existing methods for evaluating the quality of a neural posterior estimate (NPE) are largely derived from classifier-based tests or divergence measures, but these suffer from several practical drawbacks. As an alternative, we introduce the *Conditional Localization Test* (**CoLT**), a principled method designed to detect discrepancies between $p(\theta \mid x)$ and $q(\theta \mid x)$ across the full range of conditioning inputs. Rather than relying on exhaustive comparisons or density estimation at every $x$, CoLT learns a localization function that adaptively selects points $\theta_l(x)$ where the neural posterior $q$ deviates most strongly from the true posterior $p$ for that $x$. This approach is particularly advantageous in typical simulation-based inference settings, where only a single draw $\theta \sim p(\theta \mid x)$ from the true posterior is observed for each conditioning input, but where the neural posterior $q(\theta \mid x)$ can be sampled an arbitrary number of times. Our theoretical results establish necessary and sufficient conditions for assessing distributional equality across all $x$, offering both rigorous guarantees and practical scalability. Empirically, we demonstrate that CoLT not only performs better than existing methods at comparing $p$ and $q$, but also pinpoints regions of significant divergence, providing actionable insights for model refinement.

artificial intelligence, posterior, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.39)

Stepwise Variational Inference with Vine Copulas

Griesbauer, Elisabeth, Rønneberg, Leiv, Frigessi, Arnoldo, Czado, Claudia, Haff, Ingrid Hobæk

arXiv.org Machine Learning, Mar-25-2026

We propose stepwise variational inference (VI) with vine copulas: a universal VI procedure that combines vine copulas with a novel stepwise estimation procedure for the variational parameters. Vine copulas consist of a nested sequence of trees built from copulas, where more complex latent dependence can be modeled with an increasing number of trees. We propose to estimate the vine copula approximate posterior in a stepwise fashion, tree by tree along the vine structure. Further, we show that the usual backward Kullback-Leibler divergence cannot recover the correct parameters in the vine copula model; the evidence lower bound is therefore defined based on the Rényi divergence. Finally, an intuitive stopping criterion for adding further trees to the vine eliminates the need to pre-define a complexity parameter of the variational distribution, as required for most other approaches. Thus, our method interpolates between mean-field VI (MFVI) and full latent dependence. In many applications, in particular sparse Gaussian processes, our method is parsimonious with parameters, while outperforming MFVI.

artificial intelligence, machine learning, posterior, (17 more...)

arXiv.org Machine Learning

2603.22959

Country:

Asia > Middle East > Jordan (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)
Europe > Germany (0.04)
(3 more...)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

On the Statistical Consistency of Risk-Sensitive Bayesian Decision-Making

Neural Information Processing Systems, Feb-16-2026, 08:40:12 GMT

We compute the convergence rates of the RSVB approximate posterior and the corresponding optimal value.

artificial intelligence, machine learning, modeling & simulation, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
(4 more...)

Genre: Research Report (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)
Information Technology > Modeling & Simulation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

e3844e186e6eb8736e9f53c0c5889527-Paper.pdf

Neural Information Processing Systems, Feb-10-2026, 20:35:53 GMT

Inference networks of traditional Variational Autoencoders (VAEs) are typically amortized, resulting in relatively inaccurate posterior approximation compared to instance-wise variational optimization. Recent semi-amortized approaches were proposed to address this drawback; however, their iterative gradient update procedures can be computationally demanding.

artificial intelligence, inference, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

9278abf072b58caf21d48dd670b4c721-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-10-2026, 19:51:12 GMT

approximate posterior, posterior, proposal, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Comments on the main proof strategy

Neural Information Processing Systems, Feb-9-2026, 22:01:06 GMT

We thank the reviewer for the insightful comments on the proof. We will better clarify notions like "overparameterise" or "fully trained" in the main text. We further evaluate the robustness of deep ensembles on a subset of the NNs employed in Section 5.3. Table 1: FGSM and PGD attacks on the network employed in Section 5.2. For deterministic NNs Theorem 1 does not hold.

artificial intelligence, machine learning, posterior, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)

Analytical Probability Distributions and Exact Expectation-Maximization for Deep Generative Networks

Neural Information Processing Systems, Feb-9-2026, 18:37:34 GMT

These findings enable us to derive an analytical Expectation-Maximization (EM) algorithm for gradient-free DGN learning.

artificial intelligence, machine learning, posterior, (18 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > France (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)

T. (21) From the above equation, ker h = span h 0d0 n, Φ(2)

Neural Information Processing Systems, Feb-9-2026, 02:16:01 GMT

The last equation is derived as follows. In addition, we set the observation variance σx to 0.25. Logistic(·; µ, s) is the density function of a logistic distribution with the location parameter µ and the scale parameter s, and σ is the logistic sigmoid function. Before each activation, we apply layer normalization [Ba et al., 2016] to stabilize training. When the model has sufficiently high expressive power, b may diverge to infinity [Rezende and Viola, 2018], so we add a regularization term of (b+2ζ( b))/m to the loss function, where m is the number of training examples.
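For reference, the logistic density mentioned in this excerpt can be written in terms of the logistic sigmoid: with z = (x − µ)/s, the density is σ(z)(1 − σ(z))/s. The sketch below illustrates only this standard identity. The function names `sigmoid` and `logistic_pdf` are illustrative and do not come from the paper, and nothing here reproduces the paper's model or its regularization term.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_pdf(x, mu, s):
    """Density of a logistic distribution with location mu and scale s > 0."""
    z = (np.asarray(x, dtype=float) - mu) / s
    sig = sigmoid(z)
    return sig * (1.0 - sig) / s

# Sanity check: the density should integrate to roughly 1 on a wide grid.
grid = np.linspace(-40.0, 40.0, 200001)
area = float(np.sum(logistic_pdf(grid, 0.0, 1.0)) * (grid[1] - grid[0]))
print("density at the location parameter:", float(logistic_pdf(0.0, 0.0, 1.0)))
print("approximate integral over the grid:", area)
```

Writing the density through the sigmoid avoids computing the squared denominator directly and makes the relationship between the two functions named in the text explicit.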

artificial intelligence, inductive learning, machine learning, (12 more...)

Neural Information Processing Systems

Country: