 Bayesian Learning


Integrating occlusion awareness in urban motion prediction for enhanced autonomous vehicle navigation

arXiv.org Artificial Intelligence

Motion prediction is a key factor towards the full deployment of autonomous vehicles. It is fundamental in order to ensure safety while navigating through highly interactive and complex scenarios. Lack of visibility due to an obstructed view or sensor range poses a great safety issue for autonomous vehicles. The inclusion of occlusion in interaction-aware approaches is not very well explored in the literature. In this work, the MultIAMP framework, which produces multimodal probabilistic outputs from the integration of a Dynamic Bayesian Network and Markov chains, is extended to tackle occlusions. The framework is evaluated with a state-of-the-art motion planner in two realistic use cases.


Empirical Bayes for Dynamic Bayesian Networks Using Generalized Variational Inference

arXiv.org Artificial Intelligence

Dynamic Bayesian Networks (DBNs) are a class of Probabilistic Graphical Models that enable the modeling of a Markovian dynamic process by defining the transition kernel through the DAG structure of the graph found to fit a dataset. There are a number of structure learners that enable one to find the structure of a DBN to fit data, each with its own set of particular advantages and disadvantages. The structure of a DBN itself presents transparent criteria for causal discovery between variables. However, without large quantities of data, identifying a ground truth causal structure becomes unrealistic in practice. Instead, one can consider a procedure by which a set of graphs identifying structure are computed as approximate noisy solutions, and subsequently amortized in a broader statistical procedure fitting a mixture of DBNs. Each component of the mixture presents an alternative hypothesis on the causal structure. From the mixture weights, one can also compute the Bayes Factors comparing the preponderance of evidence between different models. This presents a natural opportunity for the development of Empirical Bayesian methods.
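As a rough illustration of the last point, the sketch below computes a Bayes factor between two mixture components. It assumes the mixture weights can be read as posterior model probabilities, which the abstract does not state explicitly; the function name `bayes_factor` and the example weights are hypothetical and not taken from the paper.

```python
def bayes_factor(post_i, post_j, prior_i=1.0, prior_j=1.0):
    """Bayes factor of model i over model j: posterior odds divided by prior odds.

    Assumption: post_i and post_j are posterior model probabilities
    (here, mixture weights). Only the ratio prior_i / prior_j matters,
    so the defaults encode equal prior weight on both models.
    """
    if min(post_i, post_j, prior_i, prior_j) <= 0:
        raise ValueError("all probabilities must be strictly positive")
    return (post_i / post_j) / (prior_i / prior_j)


# Hypothetical mixture weights over three candidate DBN structures.
weights = [0.6, 0.3, 0.1]

# Evidence for structure 0 relative to structure 1 under equal priors.
bf_01 = bayes_factor(weights[0], weights[1])
```

With equal prior weights the Bayes factor reduces to the ratio of the two mixture weights; unequal priors rescale it by the inverse prior odds.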


Stackelberg Games with $k$-Submodular Function under Distributional Risk-Receptiveness and Robustness

arXiv.org Artificial Intelligence

We study submodular optimization in an adversarial context, applicable to machine learning problems such as feature selection using data susceptible to uncertainties and attacks. We focus on Stackelberg games between an attacker (or interdictor) and a defender where the attacker aims to minimize the defender's objective of maximizing a $k$-submodular function. We allow uncertainties arising from the success of attacks and inherent data noise, and address challenges due to incomplete knowledge of the probability distribution of random parameters. Specifically, we introduce the Distributionally Risk-Averse $k$-Submodular Interdiction Problem (DRA $k$-SIP) and the Distributionally Risk-Receptive $k$-Submodular Interdiction Problem (DRR $k$-SIP) along with finitely convergent exact algorithms for solving them. The DRA $k$-SIP solution allows a risk-averse interdictor to develop robust strategies for real-world uncertainties. Conversely, the DRR $k$-SIP solution suggests aggressive tactics for attackers willing to embrace (distributional) risk to inflict maximum damage, identifying critical vulnerable components, which can be used for the defender's defensive strategies. The optimal values derived from both DRA $k$-SIP and DRR $k$-SIP offer a confidence interval-like range for the expected value of the defender's objective function, capturing distributional ambiguity. We conduct computational experiments using instances of feature selection and sensor placement problems, with Wisconsin breast cancer data and synthetic data, respectively.


Exact Bayesian Gaussian Cox Processes Using Random Integral

arXiv.org Machine Learning

A Gaussian Cox process is a popular model for point process data, in which the intensity function is a transformation of a Gaussian process. Posterior inference of this intensity function involves an intractable integral (i.e., the cumulative intensity function) in the likelihood, resulting in a doubly intractable posterior distribution. Here, we propose a nonparametric Bayesian approach for estimating the intensity function of an inhomogeneous Poisson process without reliance on large data augmentation or approximations of the likelihood function. We propose to jointly model the intensity and the cumulative intensity function as a transformed Gaussian process, allowing us to directly bypass the need to approximate the cumulative intensity function in the likelihood. We propose an exact MCMC sampler for posterior inference and evaluate its performance on simulated data. We demonstrate the utility of our method in three real-world scenarios including temporal and spatial event data, as well as aggregated time count data collected at multiple resolutions. Finally, we discuss extensions of our proposed method to other point processes.
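For reference, the intractable integral mentioned above appears in the standard likelihood of an inhomogeneous Poisson process. The notation below ($\lambda$ for the intensity, $\Lambda$ for the cumulative intensity, $\mathcal{T}$ for the observation window, $t_1, \dots, t_n$ for the observed events) is assumed here for illustration and may differ from the paper's.

```latex
p\left(\{t_i\}_{i=1}^{n} \mid \lambda\right)
  = \exp\!\left(-\int_{\mathcal{T}} \lambda(s)\,\mathrm{d}s\right)
    \prod_{i=1}^{n} \lambda(t_i),
\qquad
\Lambda(t) = \int_{0}^{t} \lambda(s)\,\mathrm{d}s .
```

When $\lambda$ is a transformed Gaussian process, the integral in the exponent generally has no closed form, which is why the posterior is described as doubly intractable.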


Digital Twin Calibration for Biological System-of-Systems: Cell Culture Manufacturing Process

arXiv.org Machine Learning

To support interpretable predictions and optimal control of biomanufacturing processes, in this paper, we develop a digital twin calibration approach for a multi-scale bioprocess mechanistic model, or Biological System-of-Systems (Bio-SoS) [Zheng et al., 2024], characterizing causal interdependence from molecular to cellular to macro-kinetics. Even though this study is motivated by the cell culture process, it can be extended to calibrate general Bio-SoS with modular design. Cell culture process dynamics and variations depend on three modules: (1) a single-cell mechanistic model characterizing each living cell's behavior and its interactions with the environment; (2) a metabolic shift model characterizing the change of cell metabolic phase and behaviors as a response to culture conditions and cell age; and (3) a macro-kinetic model of a bioreactor system composed of many living cells under different metabolic phases. The benefits of considering the Bio-SoS mechanistic model with modular design include: a) supporting flexible manufacturing through assembling a system of modules to account for biomanufacturing processes under different conditions and inputs; and b) facilitating the integration of heterogeneous data from different production processes, such as 2D culture and 3D aggregate culture for Induced Pluripotent Stem Cells (iPSCs) [Wang et al., 2024, Zheng et al., 2024]. By incorporating the structural properties of the Bio-SoS mechanistic model into the calibration method, we can quantify how the model uncertainties or approximation errors of different modules interact with each other and propagate through the reaction pathways to the prediction of outputs (e.g., yield and product quality attributes), which can guide interpretable and most informative Design of Experiments (DoEs) to efficiently improve model fidelity with fewer experiments.
The model uncertainty quantification approaches for digital twin calibration can be divided into two main categories: Bayesian and frequentist approaches [Corlu et al., 2020]. Bayesian approaches treat unknown model parameters as random variables and quantify our belief by posterior distributions. This involves specifying prior distributions for model parameters and updating these distributions based on the information from observed data by applying Bayes' theorem.
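The prior-to-posterior update described above can be illustrated with a toy conjugate model. The sketch below is not the paper's calibration method; it assumes a single unknown success probability with a Beta prior and Bernoulli observations, and the function names and counts are hypothetical.

```python
from fractions import Fraction


def beta_binomial_update(alpha, beta, successes, failures):
    """Apply Bayes' theorem for a Beta(alpha, beta) prior and Bernoulli data.

    Conjugacy means the posterior is again a Beta distribution whose
    parameters are the prior parameters plus the observed counts.
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, kept exact with Fraction."""
    return Fraction(alpha, alpha + beta)


# Uniform prior Beta(1, 1) over the unknown parameter (an assumption).
prior = (1, 1)

# Hypothetical data: 7 successes and 3 failures.
posterior = beta_binomial_update(*prior, successes=7, failures=3)
posterior_mean = beta_mean(*posterior)
```

The posterior mean moves from the prior mean of 1/2 toward the observed success rate of 7/10, and more data would pull it closer still.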


Learning topological states from randomized measurements using variational tensor network tomography

arXiv.org Machine Learning

Learning faithful representations of quantum states is crucial to fully characterizing the variety of many-body states created on quantum processors. While various tomographic methods such as classical shadow and MPS tomography have shown promise in characterizing a wide class of quantum states, they face unique limitations in detecting topologically ordered two-dimensional states. To address this problem, we implement and study a heuristic tomographic method that combines variational optimization on tensor networks with randomized measurement techniques. Using this approach, we demonstrate its ability to learn the ground state of the surface code Hamiltonian as well as an experimentally realizable quantum spin liquid state. In particular, we perform numerical experiments using MPS ansätze and systematically investigate the sample complexity required to achieve high fidelities for systems of sizes up to $48$ qubits. In addition, we provide theoretical insights into the scaling of our learning algorithm by analyzing the statistical properties of maximum likelihood estimation. Notably, our method is sample-efficient and experimentally friendly, only requiring snapshots of the quantum state measured randomly in the $X$ or $Z$ bases. Using this subset of measurements, our approach can effectively learn any real pure state represented by tensor networks, and we rigorously prove that random-$XZ$ measurements are tomographically complete for such states.


FI-CBL: A Probabilistic Method for Concept-Based Learning with Expert Rules

arXiv.org Machine Learning

A method for solving the concept-based learning (CBL) problem is proposed. The main idea behind the method is to divide each concept-annotated image into patches, to transform the patches into embeddings by using an autoencoder, and to cluster the embeddings assuming that each cluster will mainly contain embeddings of patches with certain concepts. To find concepts of a new image, the method implements frequentist inference by computing prior and posterior probabilities of concepts based on rates of patches from images with certain values of the concepts. Therefore, the proposed method is called the Frequentist Inference CBL (FI-CBL). FI-CBL allows us to incorporate expert rules in the form of logic functions into the inference procedure. The idea behind the incorporation is to update prior and conditional probabilities of concepts to satisfy the rules. The method is transparent because it has an explicit sequence of probabilistic calculations and a clear frequency interpretation. Numerical experiments show that FI-CBL outperforms the concept bottleneck model when the amount of training data is small. The code of the proposed algorithms is publicly available.


Optimistic Information Directed Sampling

arXiv.org Artificial Intelligence

We study the problem of online learning in contextual bandit problems where the loss function is assumed to belong to a known parametric function class. We propose a new analytic framework for this setting that bridges the Bayesian theory of information-directed sampling due to Russo and Van Roy (2018) and the worst-case theory of Foster, Kakade, Qian, and Rakhlin (2021) based on the decision-estimation coefficient. Drawing from both lines of work, we propose an algorithmic template called Optimistic Information-Directed Sampling and show that it can achieve instance-dependent regret guarantees similar to the ones achievable by the classic Bayesian IDS method, but with the major advantage of not requiring any Bayesian assumptions. The key technical innovation of our analysis is introducing an optimistic surrogate model for the regret and using it to define a frequentist version of the Information Ratio of Russo and Van Roy (2018), and a less conservative version of the Decision Estimation Coefficient of Foster et al. (2021). Keywords: Contextual bandits, information-directed sampling, decision estimation coefficient, first-order regret bounds.
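For orientation, the classic Bayesian Information Ratio referenced above is commonly written as the squared expected regret of a policy divided by its expected information gain. The form below follows the usual presentation of Russo and Van Roy's definition; the symbols $\Psi_t$, $\Delta_t$, and $g_t$ are assumed here and may not match the notation of this paper, whose frequentist variant is not reproduced.

```latex
\Psi_t(\pi) = \frac{\Delta_t(\pi)^2}{g_t(\pi)},
\qquad
\pi_t^{\mathrm{IDS}} \in \arg\min_{\pi} \Psi_t(\pi),
```

where $\Delta_t(\pi)$ is the expected instantaneous regret of sampling an action from policy $\pi$ at round $t$ and $g_t(\pi)$ is the expected information gained about the optimal action.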


Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs?

arXiv.org Artificial Intelligence

The model editing problem concerns how language models should learn new facts about the world over time. While empirical research on model editing has drawn widespread attention, the conceptual foundations of model editing remain shaky -- perhaps unsurprisingly, since model editing is essentially belief revision, a storied problem in philosophy that has eluded succinct solutions for decades. Model editing nonetheless demands a solution, since we need to be able to control the knowledge within language models. With this goal in mind, this paper critiques the standard formulation of the model editing problem and proposes a formal testbed for model editing research. We first describe 12 open problems with model editing, based on challenges with (1) defining the problem, (2) developing benchmarks, and (3) assuming LLMs have editable beliefs in the first place. Many of these challenges are extremely difficult to address, e.g. determining far-reaching consequences of edits, labeling probabilistic entailments between facts, and updating beliefs of agent simulators. Next, we introduce a semi-synthetic dataset for model editing based on Wikidata, where we can evaluate edits against labels given by an idealized Bayesian agent. This enables us to say exactly how belief revision in language models falls short of a desirable epistemic standard. We encourage further research exploring settings where such a gold standard can be compared against. Our code is publicly available at: https://github.com/peterbhase/LLM-belief-revision


Sequential three-way group decision-making for double hierarchy hesitant fuzzy linguistic term set

arXiv.org Artificial Intelligence

Group decision-making (GDM) characterized by complexity and uncertainty is an essential part of various life scenarios. Most existing research lacks tools to fuse information quickly and interpret decision results for partially formed decisions. This limitation is particularly noticeable when there is a need to improve the efficiency of GDM. To address this issue, a novel multi-level sequential three-way decision for group decision-making (S3W-GDM) method is constructed from the perspective of granular computing. This method simultaneously considers the vagueness, hesitation, and variation of GDM problems under the double hierarchy hesitant fuzzy linguistic term sets (DHHFLTS) environment. First, for fusing information efficiently, a novel multi-level expert information fusion method is proposed, and the concepts of the expert decision table and the extraction/aggregation of decision-leveled information based on multi-level granularity are defined. Second, neighborhood theory, the outranking relation, and regret theory (RT) are utilized to redesign the calculations of the conditional probability and the relative loss function. Then, the granular structure of DHHFLTS based on the sequential three-way decision (S3WD) is defined to improve the decision-making efficiency, and the decision-making strategy and interpretation of each decision level are proposed. Furthermore, the algorithm of S3W-GDM is given. Finally, an illustrative example of diagnosis is presented, and comparative and sensitivity analyses with other methods are performed to verify the efficiency and rationality of the proposed method.