Bayesian Inference
Empirical Bayes for Dynamic Bayesian Networks Using Generalized Variational Inference
Kungurtsev, Vyacheslav, Apaar, Khandelwal, Aarya, Rastogi, Parth Sandeep, Chatterjee, Bapi, Mareček, Jakub
Dynamic Bayesian Networks (DBNs) are a class of Probabilistic Graphical Models that enable the modeling of a Markovian dynamic process by defining the transition kernel through the DAG structure of the graph found to fit a dataset. There are a number of structure learners that enable one to find the structure of a DBN to fit data, each with its own particular advantages and disadvantages. The structure of a DBN itself presents transparent criteria for causal discovery between variables. However, without large quantities of data, identifying a ground-truth causal structure becomes unrealistic in practice. Instead, one can consider a procedure by which a set of graphs identifying structure are computed as approximate noisy solutions, and subsequently amortized in a broader statistical procedure fitting a mixture of DBNs. Each component of the mixture presents an alternative hypothesis on the causal structure. From the mixture weights, one can also compute the Bayes Factors comparing the preponderance of evidence between different models. This presents a natural opportunity for the development of Empirical Bayesian methods.
Linear Noise Approximation Assisted Bayesian Inference on Mechanistic Model of Partially Observed Stochastic Reaction Network
A partially observed stochastic reaction network (SRN), modeling the dynamics of a population of interacting species such as chemical molecules participating in multiple reactions, is the fundamental building block of a multi-scale bioprocess mechanistic model characterizing the causal interdependencies from molecular- to macro-kinetics. It plays a critical role in: (1) facilitating digital twin development and supporting mechanism learning for biomanufacturing processes; (2) allowing us to probe critical latent states based on partially observed information; and (3) serving as a fundamental model for a biofoundry platform [1] that can integrate heterogeneous online and offline measurements collected from different manufacturing processes and speed up bioprocess development with far fewer experiments. Model inference on the SRN mechanistic model based on heterogeneous data also helps to strengthen the theoretical foundations of federated learning on bioprocess mechanisms, through which we can train and advance knowledge. The SRN mechanistic model has three key features that make model inference challenging. First, the continuous-time state transition model, representing the evolution of the concentration or number of molecules, is highly nonlinear.
Exact Bayesian Gaussian Cox Processes Using Random Integral
Tang, Bingjing, Palacios, Julia
A Gaussian Cox process is a popular model for point process data, in which the intensity function is a transformation of a Gaussian process. Posterior inference of this intensity function involves an intractable integral (i.e., the cumulative intensity function) in the likelihood, resulting in a doubly intractable posterior distribution. Here, we propose a nonparametric Bayesian approach for estimating the intensity function of an inhomogeneous Poisson process without reliance on large data augmentation or approximations of the likelihood function. We propose to jointly model the intensity and the cumulative intensity function as a transformed Gaussian process, allowing us to bypass the need to approximate the cumulative intensity function in the likelihood. We propose an exact MCMC sampler for posterior inference and evaluate its performance on simulated data. We demonstrate the utility of our method in three real-world scenarios including temporal and spatial event data, as well as aggregated time count data collected at multiple resolutions. Finally, we discuss extensions of our proposed method to other point processes.
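For orientation, the standard log-likelihood of an inhomogeneous Poisson process shows where the intractable integral enters. The notation here ($t_1,\dots,t_n$ for the observed events, $\mathcal{T}$ for the observation window, $\lambda$ for the intensity, $f$ for the latent Gaussian process, $g$ for the transformation) is assumed for illustration and is not taken from the paper:

```latex
\log p(t_1,\dots,t_n \mid \lambda)
  = \sum_{i=1}^{n} \log \lambda(t_i) \;-\; \int_{\mathcal{T}} \lambda(s)\,\mathrm{d}s,
\qquad \lambda(s) = g\bigl(f(s)\bigr), \quad f \sim \mathcal{GP}.
```

The second term is the cumulative intensity over the window. Because $\lambda$ depends on the whole latent function $f$, this integral generally has no closed form, which is the difficulty the abstract refers to.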
Digital Twin Calibration for Biological System-of-Systems: Cell Culture Manufacturing Process
Cheng, Fuqiang, Xie, Wei, Zheng, Hua
To support interpretable predictions and optimal control of biomanufacturing processes, in this paper we develop a digital twin calibration approach for a multi-scale bioprocess mechanistic model or Biological System-of-Systems (Bio-SoS) [Zheng et al., 2024] characterizing causal interdependence from molecular- to cellular- to macro-kinetics. Even though this study is motivated by the cell culture process, it can be extended to calibrate a general Bio-SoS with modular design. Cell culture process dynamics and variations depend on three modules: (1) a single-cell mechanistic model characterizing each living cell's behavior and its interactions with the environment; (2) a metabolic shift model characterizing the change of cell metabolic phase and behavior in response to culture conditions and cell age; and (3) a macro-kinetic model of a bioreactor system composed of many living cells in different metabolic phases. The benefits of considering the Bio-SoS mechanistic model with modular design include: a) supporting flexible manufacturing by assembling a system of modules to account for biomanufacturing processes under different conditions and inputs; and b) facilitating the integration of heterogeneous data from different production processes, such as 2D culture and 3D aggregate culture for Induced Pluripotent Stem Cells (iPSCs) [Wang et al., 2024, Zheng et al., 2024]. By incorporating the structural properties of the Bio-SoS mechanistic model into the calibration method, we can quantify how the model uncertainties or approximation errors of different modules interact with each other and propagate through the reaction pathways to the prediction of outputs (e.g., yield and product quality attributes), which can guide an interpretable and maximally informative Design of Experiments (DoE) to efficiently improve model fidelity with fewer experiments.
The model uncertainty quantification approaches for digital twin calibration can be divided into two main categories: Bayesian and frequentist approaches [Corlu et al., 2020]. Bayesian approaches treat unknown model parameters as random variables and quantify our belief about them by posterior distributions. This involves specifying prior distributions for the model parameters and updating these distributions with the information from observed data by applying Bayes' theorem.
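The prior-to-posterior update described above can be illustrated with a minimal sketch. It assumes a Beta prior on a success probability with binomial data, a textbook conjugate pair chosen only for illustration and unrelated to the Bio-SoS model itself; the function names `beta_binomial_update` and `beta_mean` are hypothetical.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Apply Bayes' theorem for a Beta(alpha, beta) prior and binomial data.

    Illustrative assumption: the Beta prior is conjugate to the binomial
    likelihood, so the posterior is again a Beta distribution whose
    parameters are the prior parameters plus the observed counts.
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Prior belief: Beta(2, 2), centred on 0.5.
prior = (2.0, 2.0)
# Observed data: 7 successes and 3 failures.
posterior = beta_binomial_update(*prior, successes=7, failures=3)

print("posterior parameters:", posterior)
print("prior mean:", beta_mean(*prior))
print("posterior mean:", beta_mean(*posterior))
```

With these numbers the posterior is Beta(9, 5), so the mean moves from 0.5 toward the observed success rate of 0.7.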
Learning topological states from randomized measurements using variational tensor network tomography
Teng, Yanting, Samajdar, Rhine, Van Kirk, Katherine, Wilde, Frederik, Sachdev, Subir, Eisert, Jens, Sweke, Ryan, Najafi, Khadijeh
Learning faithful representations of quantum states is crucial to fully characterizing the variety of many-body states created on quantum processors. While various tomographic methods such as classical shadow and MPS tomography have shown promise in characterizing a wide class of quantum states, they face unique limitations in detecting topologically ordered two-dimensional states. To address this problem, we implement and study a heuristic tomographic method that combines variational optimization on tensor networks with randomized measurement techniques. Using this approach, we demonstrate its ability to learn the ground state of the surface code Hamiltonian as well as an experimentally realizable quantum spin liquid state. In particular, we perform numerical experiments using MPS ansätze and systematically investigate the sample complexity required to achieve high fidelities for systems of sizes up to $48$ qubits. In addition, we provide theoretical insights into the scaling of our learning algorithm by analyzing the statistical properties of maximum likelihood estimation. Notably, our method is sample-efficient and experimentally friendly, only requiring snapshots of the quantum state measured randomly in the $X$ or $Z$ bases. Using this subset of measurements, our approach can effectively learn any real pure state represented by tensor networks, and we rigorously prove that random-$XZ$ measurements are tomographically complete for such states.
FI-CBL: A Probabilistic Method for Concept-Based Learning with Expert Rules
Utkin, Lev V., Konstantinov, Andrei V., Kirpichenko, Stanislav R.
A method for solving the concept-based learning (CBL) problem is proposed. The main idea behind the method is to divide each concept-annotated image into patches, to transform the patches into embeddings by using an autoencoder, and to cluster the embeddings assuming that each cluster will mainly contain embeddings of patches with certain concepts. To find the concepts of a new image, the method implements frequentist inference by computing prior and posterior probabilities of concepts based on the rates of patches from images with certain values of the concepts. Therefore, the proposed method is called the Frequentist Inference CBL (FI-CBL). FI-CBL allows us to incorporate expert rules in the form of logic functions into the inference procedure. The idea behind the incorporation is to update the prior and conditional probabilities of concepts so that they satisfy the rules. The method is transparent because it has an explicit sequence of probabilistic calculations and a clear frequency interpretation. Numerical experiments show that FI-CBL outperforms the concept bottleneck model when the amount of training data is small. The code for the proposed algorithms is publicly available.
Optimistic Information Directed Sampling
Neu, Gergely, Papini, Matteo, Schwartz, Ludovic
We study the problem of online learning in contextual bandit problems where the loss function is assumed to belong to a known parametric function class. We propose a new analytic framework for this setting that bridges the Bayesian theory of information-directed sampling due to Russo and Van Roy (2018) and the worst-case theory of Foster, Kakade, Qian, and Rakhlin (2021) based on the decision-estimation coefficient. Drawing from both lines of work, we propose an algorithmic template called Optimistic Information-Directed Sampling and show that it can achieve instance-dependent regret guarantees similar to the ones achievable by the classic Bayesian IDS method, but with the major advantage of not requiring any Bayesian assumptions. The key technical innovation of our analysis is introducing an optimistic surrogate model for the regret and using it to define a frequentist version of the Information Ratio of Russo and Van Roy (2018), and a less conservative version of the Decision Estimation Coefficient of Foster et al. (2021). Keywords: Contextual bandits, information-directed sampling, decision estimation coefficient, first-order regret bounds.
Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs?
Hase, Peter, Hofweber, Thomas, Zhou, Xiang, Stengel-Eskin, Elias, Bansal, Mohit
The model editing problem concerns how language models should learn new facts about the world over time. While empirical research on model editing has drawn widespread attention, the conceptual foundations of model editing remain shaky -- perhaps unsurprisingly, since model editing is essentially belief revision, a storied problem in philosophy that has eluded succinct solutions for decades. Model editing nonetheless demands a solution, since we need to be able to control the knowledge within language models. With this goal in mind, this paper critiques the standard formulation of the model editing problem and proposes a formal testbed for model editing research. We first describe 12 open problems with model editing, based on challenges with (1) defining the problem, (2) developing benchmarks, and (3) assuming LLMs have editable beliefs in the first place. Many of these challenges are extremely difficult to address, e.g. determining far-reaching consequences of edits, labeling probabilistic entailments between facts, and updating beliefs of agent simulators. Next, we introduce a semi-synthetic dataset for model editing based on Wikidata, where we can evaluate edits against labels given by an idealized Bayesian agent. This enables us to say exactly how belief revision in language models falls short of a desirable epistemic standard. We encourage further research exploring settings where such a gold standard can be compared against. Our code is publicly available at: https://github.com/peterbhase/LLM-belief-revision
Sequential three-way group decision-making for double hierarchy hesitant fuzzy linguistic term set
Luo, Nanfang, Zhang, Qinghua, Xie, Qin, Wang, Yutai, Yin, Longjun, Wang, Guoyin
Group decision-making (GDM), characterized by complexity and uncertainty, is an essential part of various life scenarios. Most existing research lacks tools to fuse information quickly and to interpret decision results for partially formed decisions. This limitation is particularly noticeable when there is a need to improve the efficiency of GDM. To address this issue, a novel multi-level sequential three-way decision for group decision-making (S3W-GDM) method is constructed from the perspective of granular computing. This method simultaneously considers the vagueness, hesitation, and variation of GDM problems in the double hierarchy hesitant fuzzy linguistic term set (DHHFLTS) environment. First, to fuse information efficiently, a novel multi-level expert information fusion method is proposed, and the concepts of the expert decision table and the extraction/aggregation of decision-leveled information based on multi-level granularity are defined. Second, neighborhood theory, the outranking relation, and regret theory (RT) are utilized to redesign the calculations of the conditional probability and the relative loss function. Then, the granular structure of DHHFLTS based on the sequential three-way decision (S3WD) is defined to improve decision-making efficiency, and the decision-making strategy and interpretation of each decision level are proposed. Furthermore, the algorithm of S3W-GDM is given. Finally, an illustrative example of diagnosis is presented, and comparative and sensitivity analyses against other methods are performed to verify the efficiency and rationality of the proposed method.
Notes on Kalman Filter (KF, EKF, ESKF, IEKF, IESKF)
The Kalman Filter (KF) is a powerful mathematical tool widely used for state estimation in various domains, including Simultaneous Localization and Mapping (SLAM). This paper presents an in-depth introduction to the Kalman Filter and explores its several extensions: the Extended Kalman Filter (EKF), the Error-State Kalman Filter (ESKF), the Iterated Extended Kalman Filter (IEKF), and the Iterated Error-State Kalman Filter (IESKF). Each variant is meticulously examined, with detailed derivations of their mathematical formulations and discussions on their respective advantages and limitations. By providing a comprehensive overview of these techniques, this paper aims to offer valuable insights into their applications in SLAM and enhance the understanding of state estimation methodologies in complex environments.
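As a point of reference for the basic KF that these variants extend, the following is a minimal sketch of one predict/update cycle for a scalar linear-Gaussian model. The function name `kalman_step` and the parameter names (`a` for the state transition, `h` for the observation coefficient, `q` and `r` for the process and measurement noise variances) are illustrative assumptions, not notation from the paper.

```python
def kalman_step(x, p, z, a=1.0, h=1.0, q=0.0, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    Assumed model (illustrative): x_k = a * x_{k-1} + w, w ~ N(0, q)
                                  z_k = h * x_k + v,     v ~ N(0, r)
    x, p : previous state estimate and its variance
    z    : new measurement
    """
    # Predict: propagate the estimate and its variance through the dynamics.
    x_pred = a * x
    p_pred = a * p * a + q

    # Update: weigh the measurement residual by the Kalman gain.
    k = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + k * (z - h * x_pred)
    p_new = (1.0 - k * h) * p_pred
    return x_new, p_new


# Usage: filter a short sequence of noisy readings of a constant quantity.
x_est, p_est = 0.0, 1.0
for z in [2.0, 2.0]:
    x_est, p_est = kalman_step(x_est, p_est, z)
    print(x_est, p_est)
```

Each measurement pulls the estimate toward the observed value while the variance shrinks. The EKF replaces `a` and `h` with Jacobians of nonlinear models evaluated at the current estimate.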