Bayesian Inference
Uncertainty-Aware Adaptation of Large Language Models for Protein-Protein Interaction Analysis
Jantre, Sanket, Wang, Tianle, Park, Gilchan, Chopra, Kriti, Jeon, Nicholas, Qian, Xiaoning, Urban, Nathan M., Yoon, Byung-Jun
Identification of protein-protein interactions (PPIs) helps derive cellular mechanistic understanding, particularly in the context of complex conditions such as neurodegenerative disorders, metabolic syndromes, and cancer. Large Language Models (LLMs) have demonstrated remarkable potential in predicting protein structures and interactions via automated mining of vast biomedical literature; yet their inherent uncertainty remains a key challenge for deriving reproducible findings, critical for biomedical applications. In this study, we present an uncertainty-aware adaptation of LLMs for PPI analysis, leveraging fine-tuned LLaMA-3 and BioMedGPT models. To enhance prediction reliability, we integrate LoRA ensembles and Bayesian LoRA models for uncertainty quantification (UQ), ensuring confidence-calibrated insights into protein behavior. Our approach achieves competitive performance in PPI identification across diverse disease contexts while addressing model uncertainty, thereby enhancing trustworthiness and reproducibility in computational biology. These findings underscore the potential of uncertainty-aware LLM adaptation for advancing precision medicine and biomedical research.
Epistemic Uncertainty in Conformal Scores: A Unified Approach
Cabezas, Luben M. C., Santos, Vagner S., Ramos, Thiago R., Izbicki, Rafael
Conformal prediction methods create prediction bands with distribution-free guarantees but do not explicitly capture epistemic uncertainty, which can lead to overconfident predictions in data-sparse regions. Although recent conformal scores have been developed to address this limitation, they are typically designed for specific tasks, such as regression or quantile regression. Moreover, they rely on particular modeling choices for epistemic uncertainty, restricting their applicability. We introduce $\texttt{EPICSCORE}$, a model-agnostic approach that enhances any conformal score by explicitly integrating epistemic uncertainty. Leveraging Bayesian techniques such as Gaussian Processes, Monte Carlo Dropout, or Bayesian Additive Regression Trees, $\texttt{EPICSCORE}$ adaptively expands predictive intervals in regions with limited data while maintaining compact intervals where data is abundant. As with any conformal method, it preserves finite-sample marginal coverage. Additionally, it also achieves asymptotic conditional coverage. Experiments demonstrate its good performance compared to existing methods. Designed for compatibility with any Bayesian model, but equipped with distribution-free guarantees, $\texttt{EPICSCORE}$ provides a general-purpose framework for uncertainty quantification in prediction problems.
Covariates-Adjusted Mixed-Membership Estimation: A Novel Network Model with Optimal Guarantees
Fan, Jianqing, Ge, Jiawei, Hou, Jikai
This paper addresses the problem of mixed-membership estimation in networks, where the goal is to efficiently estimate the latent mixed-membership structure from the observed network. Recognizing the widespread availability and valuable information carried by node covariates, we propose a novel network model that incorporates both community information, as represented by the Degree-Corrected Mixed Membership (DCMM) model, and node covariate similarities to determine connections. We investigate the regularized maximum likelihood estimation (MLE) for this model and demonstrate that our approach achieves optimal estimation accuracy for both the similarity matrix and the mixed-membership, in terms of both the Frobenius norm and the entrywise loss. Since directly analyzing the original convex optimization problem is intractable, we employ nonconvex optimization to facilitate the analysis. A key contribution of our work is identifying a crucial assumption that bridges the gap between convex and nonconvex solutions, enabling the transfer of statistical guarantees from the nonconvex approach to its convex counterpart. Importantly, our analysis extends beyond the MLE loss and the mean squared error (MSE) used in matrix completion problems, generalizing to all the convex loss functions. Consequently, our analysis techniques extend to a broader set of applications, including ranking problems based on pairwise comparisons. Finally, simulation experiments validate our theoretical findings, and real-world data analyses confirm the practical relevance of our model.
No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers
He, Jiajun, Du, Yuanqi, Vargas, Francisco, Zhang, Dinghuai, Padhy, Shreyas, OuYang, RuiKang, Gomes, Carla, Hernรกndez-Lobato, Josรฉ Miguel
We consider the sampling problem, where the aim is to draw samples from a distribution whose density is known only up to a normalization constant. Recent breakthroughs in generative modeling to approximate a high-dimensional data distribution have sparked significant interest in developing neural network-based methods for this challenging problem. However, neural samplers typically incur heavy computational overhead due to simulating trajectories during training. This motivates the pursuit of simulation-free training procedures of neural samplers. In this work, we propose an elegant modification to previous methods, which allows simulation-free training with the help of a time-dependent normalizing flow. However, it ultimately suffers from severe mode collapse. On closer inspection, we find that nearly all successful neural samplers rely on Langevin preconditioning to avoid mode collapsing. We systematically analyze several popular methods with various objective functions and demonstrate that, in the absence of Langevin preconditioning, most of them fail to adequately cover even a simple target. Finally, we draw attention to a strong baseline by combining the state-of-the-art MCMC method, Parallel Tempering (PT), with an additional generative model to shed light on future explorations of neural samplers.
Adversarial Transform Particle Filters
Gong, Chengxin, Lin, Wei, Zhang, Cheng
The particle filter (PF) and the ensemble Kalman filter (EnKF) are widely used for approximate inference in state-space models. From a Bayesian perspective, these algorithms represent the prior by an ensemble of particles and update it to the posterior with new observations over time. However, the PF often suffers from weight degeneracy in high-dimensional settings, whereas the EnKF relies on linear Gaussian assumptions that can introduce significant approximation errors. In this paper, we propose the Adversarial Transform Particle Filter (ATPF), a novel filtering framework that combines the strengths of the PF and the EnKF through adversarial learning. Specifically, importance sampling is used to ensure statistical consistency as in the PF, while adversarially learned transformations, such as neural networks, allow accurate posterior matching for nonlinear and non-Gaussian systems. In addition, we incorporate kernel methods to ease optimization and leverage regularization techniques based on optimal transport for better statistical properties and numerical stability. We provide theoretical guarantees, including generalization bounds for both the analysis and forecast steps of ATPF. Extensive experiments across various nonlinear and non-Gaussian scenarios demonstrate the effectiveness and practical advantages of our method.
Neurons as Monte Carlo Samplers: Bayesian ๏ฟผInference and Learning in Spiking Networks
We propose a spiking network model capable of performing both approximate inference and learning for any hidden Markov model. The lower layer sensory neurons detect noisy measurements of hidden world states. The higher layer neurons with recurrent connections infer a posterior distribution over world states from spike trains generated by sensory neurons. We show how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in the population of inference neurons represents a sample of a particular hidden world state.
Minimax-optimal Inference from Partial Rankings
Bruce Hajek, Sewoong Oh, Jiaming Xu
This paper studies the problem of rank aggregation under the Plackett-Luce model. The goal is to infer a global ranking and related scores of the items, based on partial rankings provided by multiple users over multiple subsets of items. A question of particular interest is how to optimally assign items to users for ranking and how many item assignments are needed to achieve a target estimation error. Without any assumptions on how the items are assigned to users, we derive an oracle lower bound and the Cramรฉr-Rao lower bound of the estimation error. We prove an upper bound on the estimation error achieved by the maximum likelihood estimator, and show that both the upper bound and the Cramรฉr-Rao lower bound inversely depend on the spectral gap of the Laplacian of an appropriately defined comparison graph. Since random comparison graphs are known to have large spectral gaps, this suggests the use of random assignments when we have the control. Precisely, the matching oracle lower bound and the upper bound on the estimation error imply that the maximum likelihood estimator together with a random assignment is minimax-optimal up to a logarithmic factor. We further analyze a popular rankbreaking scheme that decompose partial rankings into pairwise comparisons. We show that even if one applies the mismatched maximum likelihood estimator that assumes independence (on pairwise comparisons that are now dependent due to rank-breaking), minimax optimal performance is still achieved up to a logarithmic factor.
General Table Completion using a Bayesian Nonparametric Model
Isabel Valera, Zoubin Ghahramani
Even though heterogeneous databases can be found in a broad variety of applications, there exists a lack of tools for estimating missing data in such databases. In this paper, we provide an efficient and robust table completion tool, based on a Bayesian nonparametric latent feature model. In particular, we propose a general observation model for the Indian buffet process (IBP) adapted to mixed continuous (real-valued and positive real-valued) and discrete (categorical, ordinal and count) observations. Then, we propose an inference algorithm that scales linearly with the number of observations. Finally, our experiments over five real databases show that the proposed approach provides more robust and accurate estimates than the standard IBP and the Bayesian probabilistic matrix factorization with Gaussian observations.
Augur: Data-Parallel Probabilistic Modeling
Jean-Baptiste Tristan, Daniel Huang, Joseph Tassarotti, Adam C. Pocock, Stephen Green, Guy L. Steele
Implementing inference procedures for each new probabilistic model is timeconsuming and error-prone. Probabilistic programming addresses this problem by allowing a user to specify the model and then automatically generating the inference procedure. To make this practical it is important to generate high performance inference code. In turn, on modern architectures, high performance requires parallel execution. In this paper we present Augur, a probabilistic modeling language and compiler for Bayesian networks designed to make effective use of data-parallel architectures such as GPUs. We show that the compiler can generate data-parallel inference code scalable to thousands of GPU cores by making use of the conditional independence relationships in the Bayesian network.