AITopics

2504.12585

Country:

Asia (0.93)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Gruhlke, Robert, Hanu, Matei, Schillings, Claudia, Wacker, Philipp

Gradient-Free Sequential Bayesian Experimental Design via Interacting Particle Systems

arXiv.org Machine LearningApr-17-2025

We introduce a gradient-free framework for Bayesian Optimal Experimental Design (BOED) in sequential settings, aimed at complex systems where gradient information is unavailable. Our method combines Ensemble Kalman Inversion (EKI) for design optimization with the Affine-Invariant Langevin Dynamics (ALDI) sampler for efficient posterior sampling--both of which are derivative-free and ensemble-based. To address the computational challenges posed by nested expectations in BOED, we propose variational Gaussian and parametrized Laplace approximations that provide tractable upper and lower bounds on the Expected Information Gain (EIG). These approximations enable scalable utility estimation in high-dimensional spaces and PDE-constrained inverse problems. We demonstrate the performance of our framework through numerical experiments ranging from linear Gaussian models to PDE-based inference tasks, highlighting the method's robustness, accuracy, and efficiency in information-driven experimental design.

approximation, artificial intelligence, machine learning, (16 more...)

2504.1332

Country:

Oceania > New Zealand > South Island > Canterbury Region > Christchurch (0.04)
North America > United States > New York (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Lochab, Anamika, Zhang, Ruqi

Energy-Based Reward Models for Robust Language Model Alignment

arXiv.org Machine LearningApr-17-2025

Reward models (RMs) are essential for aligning Large Language Models (LLMs) with human preferences. However, they often struggle with capturing complex human preferences and generalizing to unseen data. To address these challenges, we introduce Energy-Based Reward Model (EBRM), a lightweight post-hoc refinement framework that enhances RM robustness and generalization. EBRM models the reward distribution explicitly, capturing uncertainty in human preferences and mitigating the impact of noisy or misaligned annotations. It achieves this through conflict-aware data filtering, label-noise-aware contrastive training, and hybrid initialization. Notably, EBRM enhances RMs without retraining, making it computationally efficient and adaptable across different models and tasks. Empirical evaluations on RM benchmarks demonstrate significant improvements in both robustness and generalization, achieving up to a 5.97% improvement in safety-critical alignment tasks compared to standard RMs. Furthermore, reinforcement learning experiments confirm that our refined rewards enhance alignment quality, effectively delaying reward hacking. These results demonstrate our approach as a scalable and effective enhancement for existing RMs and alignment pipelines. The code is available at EBRM.

large language model, machine learning, natural language, (21 more...)

2504.13134

Country:

North America > United States (0.04)
Europe > Italy > Sardinia (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Air (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Consumer Health (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Shaposhnyk, Olha, Abid, Noor, Zakir, Mouri, Yanushkevich, Svetlana

Probabilistic causal graphs as categorical data synthesizers: Do they do better than Gaussian Copulas and Conditional Tabular GANs?

arXiv.org Artificial IntelligenceApr-17-2025

This study investigates the generation of high-quality synthetic categorical data, such as survey data, using causal graph models. Generating synthetic data aims not only to create a variety of data for training the models but also to preserve privacy while capturing relationships between the data. The research employs Structural Equation Modeling (SEM) followed by Bayesian Networks (BN). We used the categorical data that are based on the survey of accessibility to services for people with disabilities. We created both SEM and BN models to represent causal relationships and to capture joint distributions between variables. In our case studies, such variables include, in particular, demographics, types of disability, types of accessibility barriers and frequencies of encountering those barriers. The study compared the SEM-based BN method with alternative approaches, including the probabilistic Gaussian copula technique and generative models like the Conditional Tabular Generative Adversarial Network (CTGAN). The proposed method outperformed others in statistical metrics, including the Chi-square test, Kullback-Leibler divergence, and Total Variation Distance (TVD). In particular, the BN model demonstrated superior performance, achieving the highest TVD, indicating alignment with the original data. The Gaussian Copula ranked second, while CTGAN exhibited moderate performance. These analyses confirmed the ability of the SEM-based BN to produce synthetic data that maintain statistical and relational validity while maintaining confidentiality. This approach is particularly beneficial for research on sensitive data, such as accessibility and disability studies.

artificial intelligence, bayesian inference, machine learning, (20 more...)

2504.11547

Country: North America > Canada (0.47)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Padilla, Carlos Misael Madrid, Fan, Shitao, Lin, Lizhen

Robust and Scalable Variational Bayes

arXiv.org Machine LearningApr-16-2025

We propose a robust and scalable framework for variational Bayes (VB) that effectively handles outliers and contamination of arbitrary nature in large datasets. Our approach divides the dataset into disjoint subsets, computes the posterior for each subset, and applies VB approximation independently to these posteriors. The resulting variational posteriors with respect to the subsets are then aggregated using the geometric median of probability measures, computed with respect to the Wasserstein distance. This novel aggregation method yields the Variational Median Posterior (VM-Posterior) distribution. We rigorously demonstrate that the VM-Posterior preserves contraction properties akin to those of the true posterior, while accounting for approximation errors or the variational gap inherent in VB methods. We also provide provable robustness guarantee of the VM-Posterior. Furthermore, we establish a variational Bernstein-von Mises theorem for both multivariate Gaussian distributions with general covariance structures and the mean-field variational family. To facilitate practical implementation, we adapt existing algorithms for computing the VM-Posterior and evaluate its performance through extensive numerical experiments. The results highlight its robustness and scalability, making it a reliable tool for Bayesian inference in the presence of complex, contaminated datasets.

machine learning, natural language, posterior, (18 more...)

2504.12528

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
Europe > Spain > Galicia > Madrid (0.05)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Mazaheri, Bijan, Zhang, Jiaqi, Uhler, Caroline

Meta-Dependence in Conditional Independence Testing

arXiv.org Machine LearningApr-16-2025

Constraint-based causal discovery algorithms utilize many statistical tests for conditional independence to uncover networks of causal dependencies. These approaches to causal discovery rely on an assumed correspondence between the graphical properties of a causal structure and the conditional independence properties of observed variables, known as the causal Markov condition and faithfulness. Finite data yields an empirical distribution that is "close" to the actual distribution. Across these many possible empirical distributions, the correspondence to the graphical properties can break down for different conditional independencies, and multiple violations can occur at the same time. We study this "meta-dependence" between conditional independence properties using the following geometric intuition: each conditional independence property constrains the space of possible joint distributions to a manifold. The "meta-dependence" between conditional independences is informed by the position of these manifolds relative to the true probability distribution. We provide a simple-to-compute measure of this meta-dependence using information projections and consolidate our findings empirically using both synthetic and real-world data.

artificial intelligence, bayesian inference, machine learning, (14 more...)

2504.12594

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New Hampshire > Grafton County > Hanover (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.66)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Cheraghi, Radin, Mahfoozi, Amir Mohammad, Zolfaghari, Sepehr, Shabani, Mohammadshayan, Ramezani, Maryam, Rabiee, Hamid R.

Epistemic Uncertainty-aware Recommendation Systems via Bayesian Deep Ensemble Learning

arXiv.org Artificial IntelligenceApr-16-2025

Recommending items to users has long been a fundamental task, and studies have tried to improve it ever since. Most well-known models commonly employ representation learning to map users and items into a unified embedding space for matching assessment. These approaches have primary limitations, especially when dealing with explicit feedback and sparse data contexts. Two primary limitations are their proneness to overfitting and failure to incorporate epistemic uncertainty in predictions. To address these problems, we propose a novel Bayesian Deep Ensemble Collaborative Filtering method named BDECF. To improve model generalization and quality, we utilize Bayesian Neural Networks, which incorporate uncertainty within their weight parameters. In addition, we introduce a new interpretable non-linear matching approach for the user and item embeddings, leveraging the advantages of the attention mechanism. Furthermore, we endorse the implementation of an ensemble-based supermodel to generate more robust and reliable predictions, resulting in a more complete model. Empirical evaluation through extensive experiments and ablation studies across a range of publicly accessible real-world datasets with differing sparsity characteristics confirms our proposed method's effectiveness and the importance of its components.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2504.10753

Country: Asia > Middle East > Iran (0.15)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceApr-16-2025

Who is More Bayesian: Humans or ChatGPT?

Mu, Tianshi, Rawat, Pranjal, Rust, John, Zhang, Chengjun, Zhong, Qixuan

We compare the performance of human and artificially intelligent (AI) decision makers in simple binary classification tasks where the optimal decision rule is given by Bayes Rule. We reanalyze choices of human subjects gathered from laboratory experiments conducted by El-Gamal and Grether and Holt and Smith. We confirm that while overall, Bayes Rule represents the single best model for predicting human choices, subjects are heterogeneous and a significant share of them make suboptimal choices that reflect judgement biases described by Kahneman and Tversky that include the ``representativeness heuristic'' (excessive weight on the evidence from the sample relative to the prior) and ``conservatism'' (excessive weight on the prior relative to the sample). We compare the performance of AI subjects gathered from recent versions of large language models (LLMs) including several versions of ChatGPT. These general-purpose generative AI chatbots are not specifically trained to do well in narrow decision making tasks, but are trained instead as ``language predictors'' using a large corpus of textual data from the web. We show that ChatGPT is also subject to biases that result in suboptimal decisions. However we document a rapid evolution in the performance of ChatGPT from sub-human performance for early versions (ChatGPT 3.5) to superhuman and nearly perfect Bayesian classifications in the latest versions (ChatGPT 4o).

large language model, machine learning, natural language, (20 more...)

2504.10636

Country:

North America > United States > Wisconsin (0.15)
North America > United States > California (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Häggström, Henrik, Persson, Sebastian, Cvijovic, Marija, Picchini, Umberto

Simulation-based inference for stochastic nonlinear mixed-effects models with applications in systems biology

arXiv.org Machine LearningApr-15-2025

The analysis of data from multiple experiments, such as observations of several individuals, is commonly approached using mixed-effects models, which account for variation between individuals through hierarchical representations. This makes mixed-effects models widely applied in fields such as biology, pharmacokinetics, and sociology. In this work, we propose a novel methodology for scalable Bayesian inference in hierarchical mixed-effects models. Our framework first constructs amortized approximations of the likelihood and the posterior distribution, which are then rapidly refined for each individual dataset, to ultimately approximate the parameters posterior across many individuals. The framework is easily trainable, as it uses mixtures of experts but without neural networks, leading to parsimonious yet expressive surrogate models of the likelihood and the posterior. We demonstrate the effectiveness of our methodology using challenging stochastic models, such as mixed-effects stochastic differential equations emerging in systems biology-driven problems. However, the approach is broadly applicable and can accommodate both stochastic and deterministic models. We show that our approach can seamlessly handle inference for many parameters. Additionally, we applied our method to a real-data case study of mRNA transfection. When compared to exact pseudomarginal Bayesian inference, our approach proved to be both fast and competitive in terms of statistical accuracy.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2504.11279

Country:

Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Li, Chengkun, Huggins, Bobby, Mikkola, Petrus, Acerbi, Luigi

Normalizing Flow Regression for Bayesian Inference with Offline Likelihood Evaluations

arXiv.org Machine LearningApr-15-2025

Bayesian inference provides a principled framework for quantifying uncertainty in both parameters and models by computing full posterior distributions and model evidence (Gelman et al., 2013). However, Bayesian inference is often analytically intractable, requiring the use of approximate methods like Markov chain Monte Carlo (MCMC; Brooks, 2011) or variational inference (VI; Blei et al., 2017). These methods typically necessitate repeated evaluations of the target density, and many require differentiability of the model (Neal, 2011; Kucukelbir et al., 2017). When model evaluations are computationally expensive - for instance, involving extensive numerical methods - these requirements make standard Bayesian approaches impractical. Due to these computational demands, practitioners often resort to simpler alternatives such as maximum a posteriori (MAP) estimation or maximum likelihood estimation (MLE); 1 see for example Wilson and Collins (2019); Ma et al. (2023). While these point estimates can provide useful insights, they fail to capture parameter uncertainty, potentially leading to overconfident or biased conclusions (Gelman et al., 2013). This limitation highlights the need for efficient posterior approximation methods that avoid the computational costs of standard inference techniques.1.

artificial intelligence, evaluation, machine learning, (14 more...)

2504.11554

Country:

Europe > Finland > Uusimaa > Helsinki (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
North America > United States > New Jersey (0.04)
(3 more...)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)