Bayesian Learning
Diffusion models for inverse problems
Chung, Hyungjin, Kim, Jeongsol, Ye, Jong Chul
The use of diffusion priors to solve inverse problems in imaging has matured significantly over the years. In this chapter, we review the approaches that have been proposed. We categorize them into the more classic explicit approximation approaches and others, which include variational inference, sequential Monte Carlo, and decoupled data consistency. We cover extensions to more challenging situations, including blind cases, high-dimensional data, and problems under data scarcity and distribution mismatch. More recent approaches that aim to leverage multimodal information through text are also covered. Through this chapter, we aim to (i) distill the common mathematical threads that connect these algorithms, (ii) systematically contrast their assumptions and performance trade-offs across representative inverse problems, and (iii) spotlight the open theoretical and practical challenges by clarifying the landscape of diffusion-model-based inverse problem solvers.
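As a point of reference, one mathematical thread shared by many diffusion-based solvers is Bayes' rule applied to the noisy variable. The notation below is an assumption made for illustration and is not taken from the chapter: $x_t$ is the noisy sample at diffusion time $t$, $y$ is the measurement, and $p_t$ denotes densities at time $t$.

```latex
% Assumed notation: x_t = noisy sample at time t, y = measurement.
% Bayes' rule gives p_t(x_t | y) \propto p_t(x_t) p_t(y | x_t); taking the
% gradient of the log removes the normalizing constant p(y).
\nabla_{x_t} \log p_t(x_t \mid y)
  = \nabla_{x_t} \log p_t(x_t) + \nabla_{x_t} \log p_t(y \mid x_t)
```

The first term is the score supplied by a pretrained diffusion prior. The second term is generally intractable, and solver families can be read, roughly, as different ways of approximating or sidestepping it.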
Trustworthy scientific inference for inverse problems with generative models
Carzon, James, Masserano, Luca, Ingram, Joshua D., Shen, Alex, Junior, Antonio Carlos Herling Ribeiro, Dorigo, Tommaso, Doro, Michele, Speagle, Joshua S., Izbicki, Rafael, Lee, Ann B.
Generative artificial intelligence (AI) excels at producing complex data structures (text, images, videos) by learning patterns from training examples. Across scientific disciplines, researchers are now applying generative models to "inverse problems" to infer hidden parameters from observed data. While these methods can handle intractable models and large-scale studies, they can also produce biased or overconfident conclusions. We present a solution with Frequentist-Bayes (FreB), a mathematically rigorous protocol that reshapes AI-generated probability distributions into confidence regions that consistently include true parameters with the expected probability, while achieving minimum size when training and target data align. We demonstrate FreB's effectiveness by tackling diverse case studies in the physical sciences: identifying unknown sources under dataset shift, reconciling competing theoretical models, and mitigating selection bias and systematics in observational studies. By providing validity guarantees with interpretable diagnostics, FreB enables trustworthy scientific inference across fields where direct likelihood evaluation remains impossible or prohibitively expensive.
Instance-Dependent Continuous-Time Reinforcement Learning via Maximum Likelihood Estimation
Zhao, Runze, Yu, Yue, Wang, Ruhan, Huang, Chunfeng, Zhou, Dongruo
Continuous-time reinforcement learning (CTRL) provides a natural framework for sequential decision-making in dynamic environments where interactions evolve continuously over time. While CTRL has shown growing empirical success, its ability to adapt to varying levels of problem difficulty remains poorly understood. In this work, we investigate the instance-dependent behavior of CTRL and introduce a simple, model-based algorithm built on maximum likelihood estimation (MLE) with a general function approximator. Unlike existing approaches that estimate system dynamics directly, our method estimates the state marginal density to guide learning. We establish instance-dependent performance guarantees by deriving a regret bound that scales with the total reward variance and measurement resolution. Notably, the regret becomes independent of the specific measurement strategy when the observation frequency adapts appropriately to the problem's complexity. To further improve performance, our algorithm incorporates a randomized measurement schedule that enhances sample efficiency without increasing measurement cost. These results highlight a new direction for designing CTRL algorithms that automatically adjust their learning behavior based on the underlying difficulty of the environment.
Uncertainty Quantification for Large-Scale Deep Networks via Post-StoNet Modeling
Deep learning has revolutionized modern data science. However, how to accurately quantify the uncertainty of predictions from large-scale deep neural networks (DNNs) remains an unresolved issue. To address this issue, we introduce a novel post-processing approach. This approach feeds the output from the last hidden layer of a pre-trained large-scale DNN model into a stochastic neural network (StoNet), then trains the StoNet with a sparse penalty on a validation dataset and constructs prediction intervals for future observations. We establish a theoretical guarantee for the validity of this approach; in particular, the parameter estimation consistency for the sparse StoNet is essential for the success of this approach. Comprehensive experiments demonstrate that the proposed approach can construct honest confidence intervals with shorter interval lengths compared to conformal methods and achieves better calibration compared to other post-hoc calibration techniques. Additionally, we show that the StoNet formulation provides us with a platform to adapt sparse learning theory and methods from linear models to DNNs.
Consistent DAG selection for Bayesian causal discovery under general error distributions
Chaudhuri, Anamitra, Bhattacharya, Anirban, Ni, Yang
Learning causal structure in complex systems is a fundamental challenge across a broad range of disciplines, from traditional scientific fields to modern engineering and technology. Unlike conventional statistical methods that focus merely on correlation, the field of causal discovery primarily considers the problem of discovering the directionality and strength of causal relationships between variables, often from observational data. Thus, it has become a critical tool for researchers aiming to predict the effects of interventions on the systems, especially where controlled experimentation may be expensive, unethical, or even infeasible. Such necessities arise not only in various areas of natural science, such as epidemiology [56], public health [65], genomics [14], neuroscience [86], and climate and environmental science [60], but also in numerous domains in social science, such as psychology [50], philosophy [26], and economics [37]. Moreover, with recent advances in science and technology and the increase in size and complexity of data generation processes, causal discovery has acquired significant relevance in the fields of machine learning [63] and artificial intelligence [81, 82] through various emerging areas such as causal representation learning [64, 85], causal transfer learning [83], causal algorithmic fairness [84], and causal reinforcement learning [5]. This work focuses on learning causal structures from purely observational data within the framework of causal Bayesian networks, which are widely used to represent causal relationships among variables through directed acyclic graphs (DAGs). This is, in general, a nontrivial and difficult task due to the vast number of potential DAG structures and multiple DAGs representing the same set of conditional independence relationships. In fact, DAGs are generally identifiable only up to their corresponding Markov equivalence class, in which all DAGs encode the same conditional independencies [31].
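To make the Markov equivalence statement concrete, the sketch below uses the classical characterization that two DAGs on the same vertex set are Markov equivalent exactly when they share the same skeleton and the same v-structures (unshielded colliders). The function names and the three-node graphs are illustrative assumptions, and the inputs are assumed to be valid DAGs given as lists of directed edges.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of the graph: one frozenset per edge."""
    return {frozenset(edge) for edge in edges}

def v_structures(edges):
    """Unshielded colliders a -> c <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(edges_1, edges_2):
    """Same skeleton and same v-structures (same vertex set assumed)."""
    return (skeleton(edges_1) == skeleton(edges_2)
            and v_structures(edges_1) == v_structures(edges_2))

# Hypothetical three-node examples.
chain = [("X", "Y"), ("Y", "Z")]          # X -> Y -> Z
reversed_chain = [("Z", "Y"), ("Y", "X")]  # X <- Y <- Z
collider = [("X", "Y"), ("Z", "Y")]        # X -> Y <- Z

same_class = markov_equivalent(chain, reversed_chain)
different_class = markov_equivalent(chain, collider)
```

Here the chain and its reversal encode the same conditional independencies, so observational data alone cannot tell them apart, whereas the collider falls in a different equivalence class.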
Frugal, Flexible, Faithful: Causal Data Simulation via Frengression
Yang, Linying, Evans, Robin J., Shen, Xinwei
The use of machine learning tools has given causal inference a new lease of life, enabling complex models to be used with principled causal estimators and guarantees about statistically important quantities (Wager and Athey, 2018; Chernozhukov et al., 2018; Hahn et al., 2020). To build trustworthy causal models, however, we also need to understand when these methods may be more or less reliable, or perhaps fail completely. This implies that causal inference needs a set of good benchmarking tools. Unfortunately, real-world datasets are not ideal for this task, because they cannot give us access to the ground truth other than in a few very special circumstances. In particular, they rarely provide the counterfactual outcomes we care about, and the distribution we want to evaluate often differs from the one that produced the observations. Well-designed simulations can address this discrepancy (Neal et al., 2020; Parikh et al., 2022); they allow us to choose a ground truth, stress-test new methods, compare their generalizability and stability, and expose failure modes before deployment.
Understanding the Essence: Delving into Annotator Prototype Learning for Multi-Class Annotation Aggregation
Chen, Ju, Feng, Jun, Zhang, Shenyu
Multi-class classification annotations have significantly advanced AI applications, with truth inference serving as a critical technique for aggregating noisy and biased annotations. Existing state-of-the-art methods typically model each annotator's expertise using a confusion matrix. However, these methods suffer from two widely recognized issues: 1) when most annotators label only a few tasks, or when classes are imbalanced, the estimated confusion matrices are unreliable, and 2) a single confusion matrix is often inadequate for capturing each annotator's full expertise patterns across all tasks. To address these issues, we propose a novel confusion-matrix-based method, PTBCC (ProtoType learning-driven Bayesian Classifier Combination), which provides a more reliable and richer annotator estimation through prototype learning. Specifically, we assume that there exists a set $S$ of prototype confusion matrices, which capture the inherent expertise patterns of all annotators. Rather than being described by a single confusion matrix, each annotator's expertise is modeled as a Dirichlet prior distribution over these prototypes. This prototype learning-driven mechanism circumvents the data sparsity and class imbalance issues, ensuring a richer and more flexible characterization of annotators. Extensive experiments on 11 real-world datasets demonstrate that PTBCC achieves up to a 15% accuracy improvement in the best case, and a 3% higher average accuracy while reducing computational cost by over 90%.
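One possible reading of the prototype idea can be sketched as follows: an annotator draws mixing weights over a shared set of prototype confusion matrices from a Dirichlet distribution, and the annotator's effective confusion matrix is the resulting convex combination. This is an illustrative assumption and not the PTBCC inference procedure; the prototypes, the concentration parameters, and the function name are all hypothetical.

```python
import numpy as np

def annotator_confusion(weights, prototypes):
    """Convex combination of prototype confusion matrices.

    weights:    shape (S,), non-negative and summing to 1
    prototypes: shape (S, K, K), each row of each matrix sums to 1
    """
    weights = np.asarray(weights, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    # Contract the prototype axis: result has shape (K, K).
    return np.tensordot(weights, prototypes, axes=1)

# Two hypothetical prototypes for a 2-class task.
prototypes = np.array([
    [[0.9, 0.1], [0.1, 0.9]],  # mostly accurate annotator
    [[0.5, 0.5], [0.5, 0.5]],  # random guesser
])

rng = np.random.default_rng(0)
# Hypothetical Dirichlet concentration favoring the first prototype.
weights = rng.dirichlet(alpha=[2.0, 1.0])
confusion = annotator_confusion(weights, prototypes)
```

Because every prototype is row-stochastic and the weights sum to one, the mixture is again a valid confusion matrix, which is what lets sparsely observed annotators borrow strength from patterns shared across the annotator pool.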
Actionable Counterfactual Explanations Using Bayesian Networks and Path Planning with Applications to Environmental Quality Improvement
Valero-Leal, Enrique, Larrañaga, Pedro, Bielza, Concha
Counterfactual explanations study what should have changed in order to get an alternative result, enabling end-users to understand machine learning mechanisms with counterexamples. Actionability is defined as the ability to transform the original case to be explained into a counterfactual one. We develop a method for actionable counterfactual explanations that, unlike its predecessors, does not directly leverage training data. Rather, data is only used to learn a density estimator, creating a search landscape in which to apply path planning algorithms to solve the problem and masking the endogenous data, which can be sensitive or private. We put special focus on estimating the data density using Bayesian networks, demonstrating how their enhanced interpretability is useful in high-stakes scenarios in which fairness is a growing concern. Using a synthetic benchmark composed of 15 datasets, our proposal finds more actionable and simpler counterfactuals than the current state-of-the-art algorithms. We also test our algorithm with a real-world Environmental Protection Agency dataset, facilitating a more efficient and equitable study of policies to improve the quality of life in United States of America counties. Our proposal captures the interaction of variables, ensuring equity in decisions, as policies to improve certain domains of study (air, water quality, etc.) can be detrimental in others. In particular, the sociodemographic domain is often involved, where we find important variables related to the ongoing housing crisis that can potentially have a severe negative impact on communities.
Bayes-Entropy Collaborative Driven Agents for Research Hypotheses Generation and Optimization
Duan, Shiyang, Tian, Yuan, Bing, Qi, Shao, Xiaowei
The exponential growth of scientific knowledge has made the automated generation of scientific hypotheses that combine novelty, feasibility, and research value a core challenge. Existing methods based on large language models fail to systematically model the uncertainty inherent in hypotheses or to incorporate the closed-loop feedback mechanisms crucial for refinement. This paper proposes a multi-agent collaborative framework called HypoAgents, which for the first time integrates Bayesian reasoning with an information entropy-driven search mechanism across three stages (hypothesis generation, evidence validation, and hypothesis refinement) to construct an iterative closed loop that simulates scientists' cognitive processes. Specifically, the framework first generates an initial set of hypotheses through diversity sampling and establishes prior beliefs based on a composite novelty-relevance-feasibility (N-R-F) score. It then employs retrieval-augmented generation (RAG) to gather external literature evidence, updating the posterior probabilities of hypotheses using Bayes' theorem. Finally, it identifies high-uncertainty hypotheses using information entropy $H = -\sum_i p_i \log p_i$ and actively refines them, guiding the iterative optimization of the hypothesis set toward higher quality and confidence. Experimental results on the ICLR 2025 conference real-world research question dataset (100 research questions) show that after 12 optimization iterations, the average ELO score of generated hypotheses improves by 116.3, surpassing the benchmark of real paper abstracts by 17.8, while the framework's overall uncertainty, as measured by Shannon entropy, decreases significantly by 0.92. This study presents an interpretable probabilistic reasoning framework for automated scientific discovery, substantially improving the quality and reliability of machine-generated research hypotheses.
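The two probabilistic ingredients named in this abstract, a Bayes update and Shannon entropy, can be sketched in a few lines. This is a minimal illustration and not the HypoAgents implementation: it assumes the hypotheses are treated as mutually exclusive, and the prior and likelihood values are made up for the example.

```python
import math

def bayes_update(priors, likelihoods):
    """Posterior over hypotheses: p(h | e) is proportional to p(e | h) * p(h)."""
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    return [u / total for u in unnormalized]

def shannon_entropy(probs):
    """H = -sum_i p_i log p_i (natural log), with 0 * log 0 taken as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical prior beliefs over three hypotheses and the likelihood
# of some retrieved evidence under each of them.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.9, 0.2, 0.1]

posterior = bayes_update(priors, likelihoods)
entropy_before = shannon_entropy(priors)
entropy_after = shannon_entropy(posterior)
```

In this toy case the evidence favors the first hypothesis, so the posterior concentrates on it and the entropy drops, which is the kind of uncertainty reduction the abstract reports in aggregate.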
Rethinking Multimodality: Optimizing Multimodal Deep Learning for Biomedical Signal Classification
This study proposes a novel perspective on multimodal deep learning for biomedical signal classification, systematically analyzing how complementary feature domains impact model performance. While fusing multiple domains is often presumed to enhance accuracy, this work demonstrates that adding modalities can yield diminishing returns, as not all fusions are inherently advantageous. To validate this, five deep learning models were designed, developed, and rigorously evaluated: three unimodal (1D-CNN for time, 2D-CNN for time-frequency, and 1D-CNN-Transformer for frequency) and two multimodal (Hybrid 1, which fuses 1D-CNN and 2D-CNN; Hybrid 2, which combines 1D-CNN, 2D-CNN, and a Transformer). For ECG classification, bootstrapping and Bayesian inference revealed that Hybrid 1 consistently outperformed the 2D-CNN baseline across all metrics (p-values < 0.05, Bayesian probabilities > 0.90), confirming the synergistic complementarity of the time and time-frequency domains. Conversely, Hybrid 2's inclusion of the frequency domain offered no further improvement and sometimes a marginal decline, indicating representational redundancy, a phenomenon further substantiated by a targeted ablation study. This research redefines a fundamental principle of multimodal design in biomedical signal analysis. We demonstrate that optimal domain fusion isn't about the number of modalities, but the quality of their inherent complementarity. This paradigm-shifting concept moves beyond purely heuristic feature selection. Our novel theoretical contribution, "Complementary Feature Domains in Multimodal ECG Deep Learning," presents a mathematically quantifiable framework for identifying ideal domain combinations, demonstrating that optimal multimodal performance arises from the intrinsic information-theoretic complementarity among fused domains.
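A paired bootstrap comparison of two classifiers, of the kind alluded to in this abstract, might look like the following sketch. It is not the authors' evaluation code: the per-sample correctness vectors are invented, accuracy is the only metric shown, and the function name and defaults are assumptions.

```python
import numpy as np

def bootstrap_accuracy_diff(correct_a, correct_b, n_boot=2000, seed=0):
    """Paired bootstrap of accuracy(A) - accuracy(B) on the same test set.

    correct_a, correct_b: 0/1 indicators of whether each model classified
    each test sample correctly (same sample order for both models).
    """
    a = np.asarray(correct_a, dtype=float)
    b = np.asarray(correct_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both models must be scored on the same samples")
    rng = np.random.default_rng(seed)
    n = a.shape[0]
    # Resample test indices with replacement; reuse them for both models.
    idx = rng.integers(0, n, size=(n_boot, n))
    diffs = a[idx].mean(axis=1) - b[idx].mean(axis=1)
    low, high = np.percentile(diffs, [2.5, 97.5])
    return diffs, (low, high)

# Hypothetical correctness indicators for a fused model and a baseline.
fused_correct = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
baseline_correct = [1, 0, 1, 0, 1, 0, 1, 1, 0, 0]

diffs, interval = bootstrap_accuracy_diff(fused_correct, baseline_correct)
share_fused_better = float((diffs > 0).mean())
```

Resampling the same indices for both models keeps the comparison paired, so sample-level difficulty cancels out; an interval that excludes zero is then evidence of a genuine difference rather than test-set noise.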