AITopics | Bayesian Learning

We thank the reviewers for the valuable comments and discussions. Please find our clarifications below. IC model [11,43,45], which also learns unknown edge probability parameters. It is interesting that the reviewer brought up the frequentist versus Bayesian view on OIM-LT. LT model and our work is a frequentist approach for the online setting.

artificial intelligence, machine learning, threshold, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Sampling from Probabilistic Submodular Models

Alkis Gotovos, Hamed Hassani, Andreas Krause

Neural Information Processing Systems | Oct-2-2025, 00:57:00 GMT

Submodular and supermodular functions have found wide applicability in machine learning, capturing notions such as diversity and regularity, respectively. These notions have deep consequences for optimization, and the problem of (approximately) optimizing submodular functions has received much attention. However, beyond optimization, these notions allow specifying expressive probabilistic models that can be used to quantify predictive uncertainty via marginal inference. Prominent, well-studied special cases include Ising models and determinantal point processes, but the general class of log-submodular and log-supermodular models is much richer and little studied. In this paper, we investigate the use of Markov chain Monte Carlo sampling to perform approximate inference in general log-submodular and log-supermodular models. In particular, we consider a simple Gibbs sampling procedure, and establish two sufficient conditions, the first guaranteeing polynomial-time, and the second fast (O(n log n)) mixing. We also evaluate the efficiency of the Gibbs sampler on three examples of such models, and compare against a recently proposed variational approach.
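As a rough illustration of the Gibbs sampling procedure the abstract mentions, the sketch below samples from a log-submodular model p(S) ∝ exp(β F(S)) over subsets of n elements. It is a minimal sketch under stated assumptions, not the authors' implementation: the coverage function used for F, the names `coverage`, `gibbs_sample`, and `covers`, and the values of `beta` and `steps` are all hypothetical choices made for this example.

```python
import math
import random


def coverage(selected, covers):
    """Hypothetical submodular function: number of items covered by the chosen sets."""
    covered = set()
    for i in selected:
        covered |= covers[i]
    return len(covered)


def gibbs_sample(F, n, beta=1.0, steps=1000, seed=0):
    """Single-site Gibbs sampler for p(S) proportional to exp(beta * F(S)).

    Each step picks one element uniformly at random and resamples its
    membership from the conditional distribution given the rest of S.
    """
    rng = random.Random(seed)
    S = set()
    for _ in range(steps):
        i = rng.randrange(n)
        # Difference in F between including and excluding element i.
        gain = F(S | {i}) - F(S - {i})
        # Conditional probability that i is in S: a logistic function of beta * gain.
        p_in = 1.0 / (1.0 + math.exp(-beta * gain))
        if rng.random() < p_in:
            S.add(i)
        else:
            S.discard(i)
    return S


# Assumed toy instance: three sets over a small universe of items.
covers = [{0, 1}, {1, 2}, {3}]
sample = gibbs_sample(lambda S: coverage(S, covers), n=len(covers), beta=0.5, steps=500)
```

The returned set is the chain's state after `steps` updates; how many steps are needed for it to be close to a draw from p depends on the mixing conditions the paper studies, which this sketch does not check.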

gibbs sampler, inference, submodular function, (13 more...)

Neural Information Processing Systems

Country: Europe > Switzerland > Zürich > Zürich (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.36)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Maximum Likelihood Learning With Arbitrary Treewidth via Fast-Mixing Parameter Sets

Justin Domke

Neural Information Processing Systems | Oct-2-2025, 00:53:39 GMT

Neural Information Processing Systems http://nips.cc/

gradient, graphical model, iteration, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.51)

Reliable training and estimation of variance networks

Nicki Skafte, Martin Jørgensen, Søren Hauberg

Neural Information Processing Systems | Oct-2-2025, 00:27:39 GMT

We propose and investigate new complementary methodologies for estimating predictive variance networks in regression neural networks.

artificial intelligence, machine learning, variance, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Statistical Model Criticism using Kernel Two Sample Tests

James R. Lloyd, Zoubin Ghahramani

Neural Information Processing Systems | Oct-2-2025, 00:27:15 GMT

discrepancy, model criticism, witness function, (16 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections

Raanan Yehezkel Rohekar, Yaniv Gurwicz, Shami Nisimov, Gal Novik

Neural Information Processing Systems | Oct-2-2025, 00:16:19 GMT

How Does the Pretraining Distribution Shape In-Context Learning? Task Selection, Generalization, and Robustness

Azizian, Waïss, Hasan, Ali

arXiv.org Machine Learning | Oct-2-2025

The emergence of in-context learning (ICL) in large language models (LLMs) remains poorly understood despite its consistent effectiveness, enabling models to adapt to new tasks from only a handful of examples. To clarify and improve these capabilities, we characterize how the statistical properties of the pretraining distribution (e.g., tail behavior, coverage) shape ICL on numerical tasks. We develop a theoretical framework that unifies task selection and generalization, extending and sharpening earlier results, and show how distributional properties govern sample efficiency, task retrieval, and robustness. To this end, we generalize Bayesian posterior consistency and concentration results to heavy-tailed priors and dependent sequences, better reflecting the structure of LLM pretraining data. We then empirically study how ICL performance varies with the pretraining distribution on challenging tasks such as stochastic differential equations and stochastic processes with memory. Together, these findings suggest that controlling key statistical properties of the pretraining distribution is essential for building ICL-capable and reliable LLMs.

icl, international conference, transformer, (12 more...)

arXiv.org Machine Learning

2510.01163

Country: