AITopics

2510.22063

Country: North America > United States (0.48)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

arXiv.org Machine LearningOct-28-2025

Differentiable Constraint-Based Causal Discovery

Zhou, Jincheng, Wang, Mengbo, He, Anqi, Zhou, Yumeng, Olya, Hessam, Kocaoglu, Murat, Ribeiro, Bruno

Causal discovery from observational data is a fundamental task in artificial intelligence, with far-reaching implications for decision-making, predictions, and interventions. Despite significant advances, existing methods can be broadly categorized as constraint-based or score-based approaches. Constraint-based methods offer rigorous causal discovery but are often hindered by small sample sizes, while score-based methods provide flexible optimization but typically forgo explicit conditional independence testing. This work explores a third avenue: developing differentiable $d$-separation scores, obtained through a percolation theory using soft logic. This enables the implementation of a new type of causal discovery method: gradient-based optimization of conditional independence constraints. Empirical evaluations demonstrate the robust performance of our approach in low-sample regimes, surpassing traditional constraint-based and score-based baselines on a real-world dataset. Code and data of the proposed method are publicly available at https://github$.$com/PurdueMINDS/DAGPA.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2510.22031

Country: North America > United States (0.92)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Silveri, Marta Gentiloni, Conforti, Giovanni, Durmus, Alain

Exponential Convergence Guarantees for Iterative Markovian Fitting

The Schrödinger Bridge (SB) problem has become a fundamental tool in computational optimal transport and generative modeling. To address this problem, ideal methods such as Iterative Proportional Fitting and Iterative Markovian Fitting (IMF) have been proposed-alongside practical approximations like Diffusion Schrödinger Bridge and its Matching (DSBM) variant. While previous work have established asymptotic convergence guarantees for IMF, a quantitative, non-asymptotic understanding remains unknown. In this paper, we provide the first non-asymptotic exponential convergence guarantees for IMF under mild structural assumptions on the reference measure and marginal distributions, assuming a sufficiently large time horizon. Our results encompass two key regimes: one where the marginals are log-concave, and another where they are weakly log-concave. The analysis relies on new contraction results for the Markovian projection operator and paves the way to theoretical guarantees for DSBM.

artificial intelligence, machine learning, natural language, (14 more...)

2510.20871

Country:

Asia > Middle East > Jordan (0.04)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Boeken, Philip, Zoeter, Onno, Mooij, Joris M.

Conditional Forecasts and Proper Scoring Rules for Reliable and Accurate Performative Predictions

Performative predictions are forecasts which influence the outcomes they aim to predict, undermining the existence of correct forecasts and standard methods of elicitation and estimation. We show that conditioning forecasts on covariates that separate them from the outcome renders the target distribution forecast-invariant, guaranteeing well-posedness of the forecasting problem. However, even under this condition, classical proper scoring rules fail to elicit correct forecasts. We prove a general impossibility result and identify two solutions: (i) in decision-theoretic settings, elicitation of correct and incentive-compatible forecasts is possible if forecasts are separating; (ii) scoring with unbiased estimates of the divergence between the forecast and the induced distribution of the target variable yields correct forecasts. Applying these insights to parameter estimation, conditional forecasts and proper scoring rules enable performatively stable estimation of performatively correct parameters, resolving the issues raised by Perdomo et al. (2020). Our results expose fundamental limits of classical forecast evaluation and offer new tools for reliable and accurate forecasting in performative settings.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2510.21335

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(2 more...)

Genre: Research Report (0.83)

Industry: Leisure & Entertainment > Games (0.97)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Reducing the Probability of Undesirable Outputs in Language Models Using Probabilistic Inference

Zhao, Stephen, Li, Aidan, Brekelmans, Rob, Grosse, Roger

Reinforcement learning (RL) has become a predominant technique to align language models (LMs) with human preferences or promote outputs which are deemed to be desirable by a given reward function. Standard RL approaches optimize average reward, while methods explicitly focused on reducing the probability of undesired outputs typically come at a cost to average-case performance. To improve this tradeoff, we introduce RePULSe, a new training method that augments the standard RL loss with an additional loss that uses learned proposals to guide sampling low-reward outputs, and then reduces those outputs' probability. We run experiments demonstrating that RePULSe produces a better tradeoff of expected reward versus the probability of undesired outputs and is more adversarially robust, compared to standard RL alignment approaches and alternatives.

large language model, machine learning, natural language, (18 more...)

2510.21184

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Portugal (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government (1.00)
Banking & Finance (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.82)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.68)

Medvedeva, Arina, Balzani, Edoardo, Williams, Alex H, Keeley, Stephen L

Scalable inference of functional neural connectivity at submillisecond timescales

The Poisson Generalized Linear Model (GLM) is a foundational tool for analyzing neural spike train data. However, standard implementations rely on discretizing spike times into binned count data, limiting temporal resolution and scalability. Here, we develop Monte Carlo (MC) methods and polynomial approximations (PA) to the continuous-time analog of these models, and show them to be advantageous over their discrete-time counterparts. Further, we propose using a set of exponentially scaled Laguerre polynomials as an orthogonal temporal basis, which improves filter identification and yields closed-form integral solutions under the polynomial approximation. Applied to both synthetic and real spike-time data from rodent hippocampus, our methods demonstrate superior accuracy and scalability compared to traditional binned GLMs, enabling functional connectivity inference in large-scale neural recordings that are temporally precise on the order of synaptic dynamical timescales and in agreement with known anatomical properties of hippocampal subregions. We provide open-source implementations of both MC and PA estimators, optimized for GPU acceleration, to facilitate adoption in the neuroscience community.

approximation, artificial intelligence, machine learning, (16 more...)

2510.20966

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Dheur, Victor, Taieb, Souhaib Ben

Multivariate Latent Recalibration for Conditional Normalizing Flows

Reliably characterizing the full conditional distribution of a multivariate response variable given a set of covariates is crucial for trustworthy decision-making. However, misspecified or miscalibrated multivariate models may yield a poor approximation of the joint distribution of the response variables, leading to unreliable predictions and suboptimal decisions. Furthermore, standard recalibration methods are primarily limited to univariate settings, while conformal prediction techniques, despite generating multivariate prediction regions with coverage guarantees, do not provide a full probability density function. We address this gap by first introducing a novel notion of latent calibration, which assesses probabilistic calibration in the latent space of a conditional normalizing flow. Second, we propose latent recalibration (LR), a novel post-hoc model recalibration method that learns a transformation of the latent space with finite-sample bounds on latent calibration. Unlike existing methods, LR produces a recalibrated distribution with an explicit multivariate density function while remaining computationally efficient. Extensive experiments on both tabular and image datasets show that LR consistently improves latent calibration error and the negative log-likelihood of the recalibrated models.

artificial intelligence, calibration, machine learning, (16 more...)

2505.16636

Country:

North America > United States (0.45)
Asia > Indonesia (0.45)
Europe > United Kingdom (0.28)

Genre: Research Report (1.00)

Industry: Energy (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Large Language Bayes

Domke, Justin

Many domain experts do not have the time or expertise to write formal Bayesian models. This paper takes an informal problem description as input, and combines a large language model and a probabilistic programming language to define a joint distribution over formal models, latent variables, and data. A posterior over latent variables follows by conditioning on observed data and integrating over formal models. This presents a challenging inference problem. We suggest an inference recipe that amounts to generating many formal models from the large language model, performing approximate inference on each, and then doing a weighted average. This is justified and analyzed as a combination of self-normalized importance sampling, MCMC, and importance-weighted variational inference. Experimentally, this produces sensible predictions from only data and an informal problem description, without the need to specify a formal model.

large language model, machine learning, posterior, (21 more...)

2504.14025

Country: North America > United States (0.67)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
(2 more...)

Kanaujia, Vikas, Arora, Vipul

SCORENF: Score-based Normalizing Flows for Sampling Unnormalized distributions

Unnormalized probability distributions are central to modeling complex physical systems across various scientific domains. Traditional sampling methods, such as Markov Chain Monte Carlo (MCMC), often suffer from slow convergence, critical slowing down, poor mode mixing, and high autocorrelation. In contrast, likelihood-based and adversarial machine learning models, though effective, are heavily data-driven, requiring large datasets and often encountering mode covering and mode collapse. In this work, we propose ScoreNF, a score-based learning framework built on the Normalizing Flow (NF) architecture, integrated with an Independent Metropolis-Hastings (IMH) module, enabling efficient and unbiased sampling from unnormalized target distributions. We show that ScoreNF maintains high performance even with small training ensembles, thereby reducing reliance on computationally expensive MCMC-generated training data. We also present a method for assessing mode-covering and mode-collapse behaviours. We validate our method on synthetic 2D distributions (MOG-4 and MOG-8) and the high-dimensional $ϕ^4$ lattice field theory distribution, demonstrating its effectiveness for sampling tasks.

artificial intelligence, machine learning, target distribution, (15 more...)

2510.2133

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)

LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought

Yan, Cheng, Mohr, Felix, Viering, Tom

Sample-wise learning curves plot performance versus training set size. They are useful for studying scaling laws and speeding up hyperparameter tuning and model selection. Learning curves are often assumed to be well-behaved: monotone (i.e. improving with more data) and convex. By constructing the Learning Curves Database 1.1 (LCDB 1.1), a large-scale database with high-resolution learning curves including more modern learners (CatBoost, TabNet, RealMLP and TabPFN), we show that learning curves are less often well-behaved than previously thought. Using statistically rigorous methods, we observe significant ill-behavior in approximately 15% of the learning curves, almost twice as much as in previous estimates. We also identify which learners are to blame and show that specific learners are more ill-behaved than others. Additionally, we demonstrate that different feature scalings rarely resolve ill-behavior. We evaluate the impact of ill-behavior on downstream tasks, such as learning curve fitting and model selection, and find it poses significant challenges, underscoring the relevance and potential of LCDB 1.1 as a challenging benchmark for future research.

artificial intelligence, lcdb 1, machine learning, (17 more...)

2505.15657

Country:

Europe (0.46)
North America (0.28)
Asia (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.92)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
(2 more...)