AITopics

2505.11325

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Gouriéroux, Christian, Lu, Yang

A simple estimator of the correlation kernel matrix of a determinantal point process

arXiv.org Machine LearningMay-21-2025

Determinantal Point Process (DPP) is a flexible family of distributions for random sets defined on the finite state space { 1, ...,d }, or equivalently for multivariate binary variables. This family is parameterized by either the L-ensemble kernel Σ, which is symmetric positive definite (SPD), or the correlation kernel matrix K, which is SPD, with eigenvalues lying strictly between 0 and 1. The literature has considered the maximum likelihood estimation (MLE) of Σ and K or its algorithmic analogues (Affandi et al., 2014; Brunel et al., 2017a,b), but it has since been shown that i) the likelihood function has at least 2

artificial intelligence, estimator, machine learning, (17 more...)

2505.14529

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)

Nam, Andrew, Campbell, Declan, Griffiths, Thomas, Cohen, Jonathan, Leslie, Sarah-Jane

Understanding Task Representations in Neural Networks via Bayesian Ablation

arXiv.org Artificial IntelligenceMay-21-2025

Neural networks are powerful tools for cognitive modeling due to their flexibility and emergent properties. However, interpreting their learned representations remains challenging due to their sub-symbolic semantics. In this work, we introduce a novel probabilistic framework for interpreting latent task representations in neural networks. Inspired by Bayesian inference, our approach defines a distribution over representational units to infer their causal contributions to task performance. Using ideas from information theory, we propose a suite of tools and metrics to illuminate key model properties, including representational distributedness, manifold complexity, and polysemanticity.

artificial intelligence, machine learning, representation, (19 more...)

arXiv.org Artificial Intelligence

2505.13742

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

The Gaussian Latent Machine: Efficient Prior and Posterior Sampling for Inverse Problems

Kuric, Muhamed, Zach, Martin, Habring, Andreas, Unser, Michael, Pock, Thomas

We consider the problem of sampling from a product-of-experts-type model that encompasses many standard prior and posterior distributions commonly found in Bayesian imaging. We show that this model can be easily lifted into a novel latent variable model, which we refer to as a Gaussian latent machine. This leads to a general sampling approach that unifies and generalizes many existing sampling algorithms in the literature. Most notably, it yields a highly efficient and effective two-block Gibbs sampling approach in the general case, while also specializing to direct sampling algorithms in particular cases. Finally, we present detailed numerical experiments that demonstrate the efficiency and effectiveness of our proposed sampling approach across a wide range of prior and posterior sampling problems from Bayesian imaging.

artificial intelligence, experiment, machine learning, (20 more...)

2505.12836

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Kilian, Valentin, Guedj, Benjamin, Caron, François

Rapidly Varying Completely Random Measures for Modeling Extremely Sparse Networks

Completely random measures (CRMs) are fundamental to Bayesian nonparametric models, with applications in clustering, feature allocation, and network analysis. A key quantity of interest is the Laplace exponent, whose asymptotic behavior determines how the random structures scale. When the Laplace exponent grows nearly linearly - known as rapid variation - the induced models exhibit approximately linear growth in the number of clusters, features, or edges with sample size or network nodes. This regime is especially relevant for modeling sparse networks, yet existing CRM constructions lack tractability under rapid variation. We address this by introducing a new class of CRMs with index of variation $α\in(0,1]$, defined as mixtures of stable or generalized gamma processes. These models offer interpretable parameters, include well-known CRMs as limiting cases, and retain analytical tractability through a tractable Laplace exponent and simple size-biased representation. We analyze the asymptotic properties of this CRM class and apply it to the Caron-Fox framework for sparse graphs. The resulting models produce networks with near-linear edge growth, aligning with empirical evidence from large-scale networks. Additionally, we present efficient algorithms for simulation and posterior inference, demonstrating practical advantages through experiments on real-world sparse network datasets.

artificial intelligence, machine learning, social media, (19 more...)

2505.13206

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Research Report (0.40)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications > Social Media (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Zampetakis, Manolis, Zhou, Felix

Private Statistical Estimation via Truncation

We introduce a novel framework for differentially private (DP) statistical estimation via data truncation, addressing a key challenge in DP estimation when the data support is unbounded. Traditional approaches rely on problem-specific sensitivity analysis, limiting their applicability. By leveraging techniques from truncated statistics, we develop computationally efficient DP estimators for exponential family distributions, including Gaussian mean and covariance estimation, achieving near-optimal sample complexity. Previous works on exponential families only consider bounded or one-dimensional families. Our approach mitigates sensitivity through truncation while carefully correcting for the introduced bias using maximum likelihood estimation and DP stochastic gradient descent. Along the way, we establish improved uniform convergence guarantees for the log-likelihood function of exponential families, which may be of independent interest. Our results provide a general blueprint for DP algorithm design via truncated statistics.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2505.12541

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(15 more...)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Theory: Multidimensional Space of Events

Kavun, Sergii

This paper extends Bayesian probability theory by developing a multidimensional space of events (MDSE) theory that accounts for mutual influences between events and hypotheses sets. While traditional Bayesian approaches assume conditional independence between certain variables, real-world systems often exhibit complex interdependencies that limit classical model applicability. Building on established probabilistic foundations, our approach introduces a mathematical formalism for modeling these complex relationships. We developed the MDSE theory through rigorous mathematical derivation and validated it using three complementary methodologies: analytical proofs, computational simulations, and case studies drawn from diverse domains. Results demonstrate that MDSE successfully models complex dependencies with 15-20% improved prediction accuracy compared to standard Bayesian methods when applied to datasets with high interdimensionality. This theory particularly excels in scenarios with over 50 interrelated variables, where traditional methods show exponential computational complexity growth while MDSE maintains polynomial scaling. Our findings indicate that MDSE provides a viable mathematical foundation for extending Bayesian reasoning to complex systems while maintaining computational tractability. This approach offers practical applications in engineering challenges including risk assessment, resource optimization, and forecasting problems where multiple interdependent factors must be simultaneously considered.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2505.11566

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > United States > New York > New York County > New York City (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Banking & Finance > Economy (0.68)
Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Millard, Andrew, Zhao, Zheng, Murphy, Joshua, Maskell, Simon

Humble your Overconfident Networks: Unlearning Overfitting via Sequential Monte Carlo Tempered Deep Ensembles

Sequential Monte Carlo (SMC) methods offer a principled approach to Bayesian uncertainty quantification but are traditionally limited by the need for full-batch gradient evaluations. We introduce a scalable variant by incorporating Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) proposals into SMC, enabling efficient mini-batch based sampling. Our resulting SMCSGHMC algorithm outperforms standard stochastic gradient descent (SGD) and deep ensembles across image classification, out-of-distribution (OOD) detection, and transfer learning tasks. We further show that SMCSGHMC mitigates overfitting and improves calibration, providing a flexible, scalable pathway for converting pretrained neural networks into well-calibrated Bayesian models.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2505.11671

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Sweden > Östergötland County > Linköping (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.76)

Candelieri, Antonio, Ponti, Andrea, Archetti, Francesco

Wasserstein Barycenter Gaussian Process based Bayesian Optimization

Gaussian Process based Bayesian Optimization is a widely applied algorithm to learn and optimize under uncertainty, well-known for its sample efficiency. However, recently -- and more frequently -- research studies have empirically demonstrated that the Gaussian Process fitting procedure at its core could be its most relevant weakness. Fitting a Gaussian Process means tuning its kernel's hyperparameters to a set of observations, but the common Maximum Likelihood Estimation technique, usually appropriate for learning tasks, has shown different criticalities in Bayesian Optimization, making theoretical analysis of this algorithm an open challenge. Exploiting the analogy between Gaussian Processes and Gaussian Distributions, we present a new approach which uses a prefixed set of hyperparameters values to fit as many Gaussian Processes and then combines them into a unique model as a Wasserstein Barycenter of Gaussian Processes. We considered both "easy" test problems and others known to undermine the \textit{vanilla} Bayesian Optimization algorithm. The new method, namely Wasserstein Barycenter Gausssian Process based Bayesian Optimization (WBGP-BO), resulted promising and able to converge to the optimum, contrary to vanilla Bayesian Optimization, also on the most "tricky" test problems.

artificial intelligence, machine learning, optimization, (15 more...)

2505.12471

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)

Attribution Projection Calculus: A Novel Framework for Causal Inference in Bayesian Networks

Amin, M Ruhul

This paper introduces Attribution Projection Calculus (AP-Calculus), a novel mathematical framework for determining causal relationships in structured Bayesian networks. We investigate a specific network architecture with source nodes connected to destination nodes through intermediate nodes, where each input maps to a single label with maximum marginal probability. We prove that for each label, exactly one intermediate node acts as a deconfounder while others serve as confounders, enabling optimal attribution of features to their corresponding labels. The framework formalizes the dual nature of intermediate nodes as both confounders and deconfounders depending on the context, and establishes separation functions that maximize distinctions between intermediate representations. We demonstrate that the proposed network architecture is optimal for causal inference compared to alternative structures, including those based on Pearl's causal framework. AP-Calculus provides a comprehensive mathematical foundation for analyzing feature-label attributions, managing spurious correlations, quantifying information gain, ensuring fairness, and evaluating uncertainty in prediction models, including large language models. Theoretical verification shows that AP-Calculus not only extends but can also subsume traditional do-calculus for many practical applications, offering a more direct approach to causal inference in supervised learning contexts.

ap-calculus, artificial intelligence, machine learning, (16 more...)

2505.12094

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)