AITopics | Bayesian Inference

Collaborating Authors

Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from the probabilities of their effects, when what it is given are the probabilities of effects, given the causes.

News Overviews Instructional Materials AI-Alerts Classics

Failure Prediction from Limited Hardware Demonstrations

Parashar, Anjali, Garg, Kunal, Zhang, Joseph, Fan, Chuchu

arXiv.org Artificial IntelligenceOct-11-2024

Prediction of failures in real-world robotic systems either requires accurate model information or extensive testing. Partial knowledge of the system model makes simulation-based failure prediction unreliable. Moreover, obtaining such demonstrations is expensive, and could potentially be risky for the robotic system to repeatedly fail during data collection. This work presents a novel three-step methodology for discovering failures that occur in the true system by using a combination of a limited number of demonstrations from the true system and the failure information processed through sampling-based testing of a model dynamical system. Given a limited budget $N$ of demonstrations from true system and a model dynamics (with potentially large modeling errors), the proposed methodology comprises of a) exhaustive simulations for discovering algorithmic failures using the model dynamics; b) design of initial $N_1$ demonstrations of the true system using Bayesian inference to learn a Gaussian process regression (GPR)-based failure predictor; and c) iterative $N - N_1$ demonstrations of the true system for updating the failure predictor. To illustrate the efficacy of the proposed methodology, we consider: a) the failure discovery for the task of pushing a T block to a fixed target region with UR3E collaborative robot arm using a diffusion policy; and b) the failure discovery for an F1-Tenth racing car tracking a given raceline under an LQR control policy.

artificial intelligence, demonstration, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.09249

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.49)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Add feedback

pyhgf: A neural network library for predictive coding

Legrand, Nicolas, Weber, Lilian, Waade, Peter Thestrup, Daugaard, Anna Hedvig Møller, Khodadadi, Mojtaba, Mikuš, Nace, Mathys, Chris

arXiv.org Artificial IntelligenceOct-11-2024

Bayesian models of cognition have gained considerable traction in computational neuroscience and psychiatry. Their scopes are now expected to expand rapidly to artificial intelligence, providing general inference frameworks to support embodied, adaptable, and energy-efficient autonomous agents. A central theory in this domain is predictive coding, which posits that learning and behaviour are driven by hierarchical probabilistic inferences about the causes of sensory inputs. Biological realism constrains these networks to rely on simple local computations in the form of precision-weighted predictions and prediction errors. This can make this framework highly efficient, but its implementation comes with unique challenges on the software development side. Embedding such models in standard neural network libraries often becomes limiting, as these libraries' compilation and differentiation backends can force a conceptual separation between optimization algorithms and the systems being optimized. This critically departs from other biological principles such as self-monitoring, self-organisation, cellular growth and functional plasticity. In this paper, we introduce \texttt{pyhgf}: a Python package backed by JAX and Rust for creating, manipulating and sampling dynamic networks for predictive coding. We improve over other frameworks by enclosing the network components as transparent, modular and malleable variables in the message-passing steps. The resulting graphs can implement arbitrary computational complexities as beliefs propagation. But the transparency of core variables can also translate into inference processes that leverage self-organisation principles, and express structure learning, meta-learning or causal discovery as the consequence of network structural adaptation to surprising inputs. The code, tutorials and documentation are hosted at: https://github.com/ilabcode/pyhgf.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.09206

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Austria > Vienna (0.14)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
Europe > Denmark (0.04)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Add feedback

DFM: Interpolant-free Dual Flow Matching

Gudovskiy, Denis, Okuno, Tomoyuki, Nakata, Yohei

arXiv.org Machine LearningOct-11-2024

Continuous normalizing flows (CNFs) can model data distributions with expressive infinite-length architectures. But this modeling involves computationally expensive process of solving an ordinary differential equation (ODE) during maximum likelihood training. Recently proposed flow matching (FM) framework allows to substantially simplify the training phase using a regression objective with the interpolated forward vector field. In this paper, we propose an interpolant-free dual flow matching (DFM) approach without explicit assumptions about the modeled vector field. DFM optimizes the forward and, additionally, a reverse vector field model using a novel objective that facilitates bijectivity of the forward and reverse transformations. Our experiments with the SMAP unsupervised anomaly detection show advantages of DFM when compared to the CNF trained with either maximum likelihood or FM objectives with the state-of-the-art performance metrics.

differential equation, objective, vector field, (13 more...)

arXiv.org Machine Learning

2410.09246

Country:

North America > United States (0.05)
Asia > Japan (0.05)
Africa > Mali (0.05)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)

Add feedback

Optimal Downsampling for Imbalanced Classification with Generalized Linear Models

Chen, Yan, Blanchet, Jose, Dembczynski, Krzysztof, Nern, Laura Fee, Flores, Aaron

arXiv.org Machine LearningOct-11-2024

Downsampling or under-sampling is a technique that is utilized in the context of large and highly imbalanced classification models. We study optimal downsampling for imbalanced classification using generalized linear models (GLMs). We propose a pseudo maximum likelihood estimator and study its asymptotic normality in the context of increasingly imbalanced populations relative to an increasingly large sample size. We provide theoretical guarantees for the introduced estimator. Additionally, we compute the optimal downsampling rate using a criterion that balances statistical accuracy and computational efficiency. Our numerical experiments, conducted on both synthetic and empirical data, further validate our theoretical results, and demonstrate that the introduced estimator outperforms commonly available alternatives.

estimator, imbalanced classification, maximum likelihood estimator, (10 more...)

arXiv.org Machine Learning

2410.08994

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.93)
Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback

Linear-cost unbiased posterior estimates for crossed effects and matrix factorization models via couplings

Ceriani, Paolo Maria, Zanella, Giacomo

arXiv.org Machine LearningOct-11-2024

In recent years, unbiased Markov Chain Monte Carlo via couplings (UMCMC) has emerged as a promising framework to remove bias from MCMC estimates, thus potentially allowing for early stopping, simplifying the convergence diagnostic process and facilitating parallelization (Glynn and Rhee, 2014; Jacob et al., 2020). In UMCMC, coupled chains are run for a random number of iterations (at least up to coalescence) and their values are combined to produce unbiased estimates. A natural question that arises is whether this class of estimates incurs a greater computational cost than conventional MCMC based on simple ergodic averages and to quantify this potential difference. Framing the question differently, one may ask whether it is possible to devise UMCMC methods with computational cost matching top performing MCMCs, while enjoying the above mentioned benefits. On a different line of research, various works showed how carefully designed blocked Gibbs Samplers (BGSs), i.e. Gibbs sampling schemes that update entire blocks of coordinates jointly, can achieve state-of-the-art performances for sampling from the posterior distributions of various challenging high-dimensional Bayesian models, such as non-nested models with crossed dependencies (Papaspiliopoulos et al., 2019, 2023). In particular, BGSs achieve linear computational costs in the number of parameters and observations in asymptotic regimes where both diverge to infinity.

coupling, iteration, meeting time, (15 more...)

arXiv.org Machine Learning

2410.08939

Country:

North America > United States > Michigan (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Causal machine learning for predicting treatment outcomes

Feuerriegel, Stefan, Frauen, Dennis, Melnychuk, Valentyn, Schweisthal, Jonas, Hess, Konstantin, Curth, Alicia, Bauer, Stefan, Kilbertus, Niki, Kohane, Isaac S., van der Schaar, Mihaela

arXiv.org Machine LearningOct-11-2024

Causal machine learning (ML) offers flexible, data-driven methods for predicting treatment outcomes. Here, we present how methods from causal ML can be used to understand the effectiveness of treatments, thereby supporting the assessment and safety of drugs. A key benefit of causal ML is that allows for estimating individualized treatment effects, as well as personalized predictions of potential patient outcomes under different treatments. This offers granular insights into when treatments are effective, so that decision-making in patient care can be personalized to individual patient profiles. We further discuss how causal ML can be used in combination with both clinical trial data as well as real-world data such as clinical registries and electronic health records. We finally provide recommendations for the reliable use of causal ML in medicine. First published in Nature Medicine, 30, 958-968 (2024) by Springer Nature. Assessing the effectiveness of treatments is crucial to ensure patient safety and personalize patient care. Recent innovations in machine learning (ML) offer new, data-driven methods to estimate treatment effects from data. This branch in ML is commonly referred to as causal ML as it aims to predict a causal quantity, namely, the patient outcomes due to treatment [1]. Causal ML can be used in order to estimate treatment effects from both experimental data obtained through randomized controlled trials (RCTs) and observational data obtained from clinical registries, electronic health records, and other real-world data (RWD) sources to generate clinical evidence. A key strength of causal ML is that it allows to estimate individualized treatment effects, as well as to make personalized predictions of potential patient outcomes under different treatments.

causal ml, estimation, treatment effect, (13 more...)

arXiv.org Machine Learning

doi: 10.1038/s41591-024-02902-1

2410.0877

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
(2 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Deterministic Langevin Monte Carlo with Normalizing Flows for Bayesian Inference

Neural Information Processing SystemsOct-10-2024, 23:14:55 GMT

We propose a general purpose Bayesian inference algorithm for expensive likelihoods, replacing the stochastic term in the Langevin equation with a deterministic density gradient term. The particle density is evaluated from the current particle positions using a Normalizing Flow (NF), which is differentiable and has good generalization properties in high dimensions. We take advantage of NF preconditioning and NF based Metropolis-Hastings updates for a faster convergence. We show on various examples that the method is competitive against state of the art sampling methods.

bayesian inference, deterministic langevin monte carlo, normalizing flow

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.70)

Add feedback

Scalable Bayesian inference of dendritic voltage via spatiotemporal recurrent state space models

Neural Information Processing SystemsOct-10-2024, 21:11:28 GMT

Recent advances in optical voltage sensors have brought us closer to a critical goal in cellular neuroscience: imaging the full spatiotemporal voltage on a dendritic tree. However, current sensors and imaging approaches still face significant limitations in SNR and sampling frequency; therefore statistical denoising and interpolation methods remain critical for understanding single-trial spatiotemporal dendritic voltage dynamics. Previous denoising approaches were either based on an inadequate linear voltage model or scaled poorly to large trees. Here we introduce a scalable fully Bayesian approach. We develop a generative nonlinear model that requires few parameters per compartment of the cell but is nonetheless flexible enough to sample realistic spatiotemporal data.

scalable bayesian inference, spatiotemporal recurrent state space model, voltage, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.44)

Add feedback

Minimum Stein Discrepancy Estimators

Neural Information Processing SystemsOct-10-2024, 19:55:49 GMT

When maximum likelihood estimation is infeasible, one often turns to score matching, contrastive divergence, or minimum probability flow to obtain tractable parameter estimates. We provide a unifying perspective of these techniques as minimum Stein discrepancy estimators, and use this lens to design new diffusion kernel Stein discrepancy (DKSD) and diffusion score matching (DSM) estimators with complementary strengths. We establish the consistency, asymptotic normality, and robustness of DKSD and DSM estimators, then derive stochastic Riemannian gradient descent algorithms for their efficient optimisation. The main strength of our methodology is its flexibility, which allows us to design estimators with desirable properties for specific models at hand by carefully selecting a Stein discrepancy. We illustrate this advantage for several challenging problems for score matching, such as non-smooth, heavy-tailed or light-tailed densities.

minimum stein discrepancy estimator, score matching, strength

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Add feedback

Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees

Neural Information Processing SystemsOct-10-2024, 19:54:21 GMT

Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy that best fits observed sequences of states and actions implemented by an expert. Many algorithms for IRL have an inherent nested structure: the inner loop finds the optimal policy given parametrized rewards while the outer loop updates the estimates towards optimizing a measure of fit. For high dimensional environments such nested-loop structure entails a significant computational burden. To reduce the computational burden of a nested loop, novel methods such as SQIL \cite{reddy2019sqil} and IQ-Learn \cite{garg2021iq} emphasize policy estimation at the expense of reward estimation accuracy. However, without accurate estimated rewards, it is not possible to do counterfactual analysis such as predicting the optimal policy under different environment dynamics and/or learning new tasks.

finite-time guarantee, maximum-likelihood inverse reinforcement learning, optimal policy, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Add feedback