 Bayesian Inference


A Priori Determination of the Pretest Probability

arXiv.org Artificial Intelligence

In this manuscript, we present various proposed methods to estimate the prevalence of disease, a critical prerequisite for the adequate interpretation of screening tests. To address the limitations of these approaches, which revolve primarily around their a posteriori nature, we introduce a novel method to estimate the pretest probability of disease, a priori, utilizing the Logit function from the logistic regression model. This approach is a modification of McGee's heuristic, originally designed for estimating the posttest probability of disease. In a patient presenting with $n_\theta$ signs or symptoms, the minimal bound of the pretest probability, $\phi$, can be approximated by: $\phi \approx \frac{1}{5}\ln\left[\displaystyle\prod_{\theta=1}^{i}\kappa_\theta\right]$ where $\ln$ is the natural logarithm, and $\kappa_\theta$ is the likelihood ratio associated with the sign or symptom in question.
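As a worked illustration of the approximation above, the following minimal Python sketch evaluates one fifth of the natural logarithm of the product of the likelihood ratios. The function name and the two likelihood-ratio values are hypothetical and chosen only for illustration; the sketch does not clamp the result or reproduce any further steps the manuscript may apply.

```python
import math

def pretest_probability_lower_bound(likelihood_ratios):
    """Approximate phi as (1/5) * ln(product of the likelihood ratios)."""
    if not likelihood_ratios:
        raise ValueError("at least one likelihood ratio is required")
    if any(lr <= 0 for lr in likelihood_ratios):
        raise ValueError("likelihood ratios must be positive")
    # The sum of logs equals the log of the product and avoids overflow.
    log_product = sum(math.log(lr) for lr in likelihood_ratios)
    return log_product / 5.0

# Hypothetical likelihood ratios for two findings (illustrative values only).
phi = pretest_probability_lower_bound([2.0, 1.5])
print(round(phi, 3))
```

With these two assumed ratios the product is 3, so the sketch returns ln(3)/5, roughly 0.22.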


QCS-SGM+: Improved Quantized Compressed Sensing With Score-Based Generative Models

arXiv.org Artificial Intelligence

In practical compressed sensing (CS), the obtained measurements typically necessitate quantization to a limited number of bits prior to transmission or storage. This nonlinear quantization process poses significant recovery challenges, particularly with extremely coarse quantization such as 1-bit. Recently, an efficient algorithm called QCS-SGM was proposed for quantized CS (QCS) which utilizes score-based generative models (SGM) as an implicit prior. Due to the adeptness of SGM in capturing the intricate structures of natural signals, QCS-SGM substantially outperforms previous QCS methods. However, QCS-SGM is constrained to (approximately) row-orthogonal sensing matrices, as the computation of the likelihood score becomes intractable otherwise. To address this limitation, we introduce an advanced variant of QCS-SGM, termed QCS-SGM+, capable of handling general matrices effectively. The key idea is a Bayesian inference perspective on the likelihood score computation, wherein expectation propagation is employed for its approximate computation. Extensive experiments are conducted, demonstrating the substantial superiority of QCS-SGM+ over QCS-SGM for general sensing matrices beyond mere row-orthogonality.
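To make the 1-bit setting concrete, here is a minimal sketch of a quantized measurement model, assuming the common form y = sign(Ax + n) with Gaussian noise. It only illustrates why coarse quantization is hard to invert and is not the QCS-SGM+ recovery algorithm; the function name, matrix size, and signal values are hypothetical.

```python
import random

def one_bit_measurements(A, x, noise_std=0.0, rng=None):
    """Return sign(Ax + n) as a list of +1 / -1 values (assumed model)."""
    if rng is None:
        rng = random.Random(0)
    y = []
    for row in A:
        z = sum(a * xi for a, xi in zip(row, x)) + rng.gauss(0.0, noise_std)
        y.append(1 if z >= 0 else -1)  # keep only the sign bit
    return y

rng = random.Random(42)
M, N = 8, 5  # hypothetical number of measurements and signal length
A = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(M)]  # general matrix
x = [0.0, 1.5, 0.0, -0.7, 0.0]  # hypothetical sparse signal

y = one_bit_measurements(A, x)
y_scaled = one_bit_measurements(A, [2.0 * v for v in x])
print(y == y_scaled)  # noiseless sign measurements ignore the signal's scale
```

In the noiseless case, scaling the signal leaves every sign unchanged, so amplitude information is lost and a strong prior on the signal is needed for recovery.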


Ensemble Kalman Filtering Meets Gaussian Process SSM for Non-Mean-Field and Online Inference

arXiv.org Machine Learning

Gaussian process state-space models (GPSSMs) are a versatile and principled family of nonlinear dynamical system models. However, existing variational learning and inference methods for GPSSMs often necessitate optimizing a substantial number of variational parameters, leading to inadequate performance and efficiency. To overcome this issue, we propose incorporating the ensemble Kalman filter (EnKF), a well-established model-based filtering technique, into the variational inference framework to approximate the posterior distribution of latent states. This use of the EnKF can effectively exploit the dependencies between latent states and GP dynamics, while eliminating the need for parameterizing the variational distribution, thereby significantly reducing the number of variational parameters. Moreover, we show that our proposed algorithm allows straightforward evaluation of an approximated evidence lower bound (ELBO) in variational inference by simply summing multiple terms with readily available closed-form solutions. Leveraging automatic differentiation tools, we can hence maximize the ELBO and train the GPSSM efficiently. We also extend the proposed algorithm to accommodate an online setting and provide detailed algorithmic analyses and insights. Extensive evaluation on diverse real and synthetic datasets demonstrates the superiority of our EnKF-aided variational inference algorithms in terms of learning and inference performance compared to existing methods.
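For readers unfamiliar with the EnKF, the following sketch shows a single stochastic analysis (update) step for a scalar state that is observed directly. It is a textbook-style illustration under those simplifying assumptions, not the paper's GPSSM algorithm: the GP transition model and the ELBO are omitted, and all names and numbers are hypothetical.

```python
import random
import statistics

def enkf_update(ensemble, y_obs, obs_noise_std, rng):
    """One stochastic EnKF analysis step, scalar state, observation operator H = 1."""
    prior_var = statistics.variance(ensemble)  # sample variance of the ensemble
    gain = prior_var / (prior_var + obs_noise_std ** 2)  # scalar Kalman gain
    # Each member assimilates its own perturbed copy of the observation.
    return [
        x + gain * (y_obs + rng.gauss(0.0, obs_noise_std) - x)
        for x in ensemble
    ]

rng = random.Random(0)
prior = [rng.gauss(0.0, 2.0) for _ in range(500)]  # hypothetical prior ensemble
posterior = enkf_update(prior, y_obs=3.0, obs_noise_std=0.5, rng=rng)

print(statistics.fmean(prior), statistics.fmean(posterior))
print(statistics.variance(prior), statistics.variance(posterior))
```

The posterior is represented by the updated ensemble members themselves, so no separate variational distribution has to be parameterized in this step.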


Asynchronous Local Computations in Distributed Bayesian Learning

arXiv.org Artificial Intelligence

Due to the expanding scope of machine learning (ML) to the fields of sensor networking, cooperative robotics and many other multi-agent systems, distributed deployment of inference algorithms has received a lot of attention. These algorithms involve collaboratively learning unknown parameters from dispersed data collected by multiple agents. There are two competing aspects in such algorithms, namely, intra-agent computation and inter-agent communication. Traditionally, algorithms are designed to perform both synchronously. However, certain circumstances call for frugal use of communication channels, as they are either unreliable, time-consuming, or resource-expensive. In this paper, we propose gossip-based asynchronous communication to leverage fast computations and reduce communication overhead simultaneously. We analyze the effects of multiple (local) intra-agent computations by the active agents between successive inter-agent communications. For local computations, Bayesian sampling via unadjusted Langevin algorithm (ULA) MCMC is utilized. The communication is assumed to be over a connected graph (e.g., as in decentralized learning); however, the results can be extended to coordinated communication where there is a central server (e.g., federated learning). We theoretically quantify the convergence rates in the process. To demonstrate the efficacy of the proposed algorithm, we present simulations on a toy problem as well as on real-world data sets to train ML models to perform classification tasks. We observe faster initial convergence and improved performance accuracy, especially in the low data range. We achieve on average 78% and over 90% classification accuracy respectively on the Gamma Telescope and mHealth data sets from the UCI ML repository.
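The interplay of local ULA steps and gossip communication can be sketched as follows. This is a toy illustration under strong assumptions (scalar parameter, Gaussian local targets, a fully connected graph, plain pairwise averaging) and is not the paper's algorithm or its analysis; every name and constant below is hypothetical.

```python
import math
import random

def ula_steps(theta, grad_log_post, step_size, num_steps, rng):
    """Run several unadjusted Langevin steps from theta and return the result."""
    for _ in range(num_steps):
        noise = rng.gauss(0.0, 1.0)
        theta = (theta + step_size * grad_log_post(theta)
                 + math.sqrt(2.0 * step_size) * noise)
    return theta

def gossip_average(thetas, i, j):
    """Agents i and j replace their values with the pairwise average (in place)."""
    avg = 0.5 * (thetas[i] + thetas[j])
    thetas[i] = thetas[j] = avg

rng = random.Random(0)
local_means = [0.0, 1.0, 2.0, 3.0]  # hypothetical per-agent data summaries
thetas = [0.0] * len(local_means)

for _ in range(200):
    i, j = rng.sample(range(len(thetas)), 2)  # a random pair of agents wakes up
    for k in (i, j):
        # Gradient of the log-density of a unit-variance Gaussian at local_means[k].
        grad = lambda t, m=local_means[k]: -(t - m)
        thetas[k] = ula_steps(thetas[k], grad, 0.05, 5, rng)  # local computation
    gossip_average(thetas, i, j)  # one inter-agent communication

print(thetas)
```

Only the two active agents compute and communicate in each round, which is the sense in which the scheme is asynchronous; the number of local steps per round (5 here) is the quantity whose effect the abstract says is analyzed.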


Heckerthoughts

arXiv.org Artificial Intelligence

This manuscript is a technical memoir about my work at Stanford and Microsoft Research. Included are fundamental concepts central to machine learning and artificial intelligence, applications of these concepts, and stories behind their creation.


Neural Population Decoding and Imbalanced Multi-Omic Datasets For Cancer Subtype Diagnosis

arXiv.org Artificial Intelligence

Recent strides in the field of neural computation have seen the adoption of Winner-Take-All (WTA) circuits to facilitate the unification of hierarchical Bayesian inference and spiking neural networks (SNNs) as a neurobiologically plausible model of information processing. However, researchers have not yet reached consensus about how best to translate the stochastic responses from these networks into discrete decisions, a process known as population decoding. Although population decoding is an often underexamined part of SNNs, in this work we show that it has a significant impact on the classification performance of WTA networks. For this purpose, we apply a WTA network to the problem of cancer subtype diagnosis from multi-omic data, using datasets from The Cancer Genome Atlas (TCGA). In doing so we utilise a novel implementation of gene similarity networks, a feature encoding technique based on Kohonen's self-organising map algorithm. We further show that the impact of selecting certain population decoding methods is amplified when facing imbalanced datasets. Multi-omics data integration in cancer diagnosis refers to the integration of information from various biological "omics", e.g., genomics, transcriptomics, metabolomics, to provide a more comprehensive understanding of the molecular landscape of cancer. [...] neurobiologically inspired method of information processing which aim to solve tasks using plausible [...] In order to extract information from SNNs, we examine the spikes generated by a population of neurons in response to a stimulus. Alternatively, some research focuses on the time-dependent relationship of spiking neurons, for instance by weighting neuron responses more highly based on how quickly they fire (Grün & Rotter, 2010).


Do Bayesian Neural Networks Improve Weapon System Predictive Maintenance?

arXiv.org Artificial Intelligence

[...] systems with interval-censored data and time-varying covariates. We analyze and benchmark our approach, LaplaceNN, on synthetic and real datasets with standard classification metrics such as Receiver Operating Characteristic (ROC) Area Under Curve (AUC), Precision-Recall (PR) AUC, and reliability curve visualizations. [...] This approach lacks the extra information on individual weapon system characteristics. A recent method introduced the Weibull-Cox Bayesian Neural Network tested on several weapon systems, albeit requiring a held-out validation set [7]. Moreover, while understanding the population reliability trends via a Weibull distribution is informative, this formulation does [...]


RL$^3$: Boosting Meta Reinforcement Learning via RL inside RL$^2$

arXiv.org Artificial Intelligence

Meta reinforcement learning (meta-RL) methods such as RL$^2$ have emerged as promising approaches for learning data-efficient RL algorithms tailored to a given task distribution. However, these RL algorithms struggle with long-horizon tasks and out-of-distribution tasks since they rely on recurrent neural networks to process the sequence of experiences instead of summarizing them into general RL components such as value functions. Moreover, even transformers have a practical limit to the length of histories they can efficiently reason about before training and inference costs become prohibitive. In contrast, traditional RL algorithms are data-inefficient since they do not leverage domain knowledge, but they do converge to an optimal policy as more data becomes available. In this paper, we propose RL$^3$, a principled hybrid approach that combines traditional RL and meta-RL by incorporating task-specific action-values learned through traditional RL as an input to the meta-RL neural network. We show that RL$^3$ earns greater cumulative reward on long-horizon and out-of-distribution tasks compared to RL$^2$, while maintaining the efficiency of the latter in the short term. Experiments are conducted on both custom and benchmark discrete domains from the meta-RL literature that exhibit a range of short-term, long-term, and complex dependencies.
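The core idea of feeding task-specific action-values into the meta-RL network can be sketched as follows. The abstract does not say which traditional RL algorithm produces the action-values, so tabular Q-learning is an assumption here, and the function names, hyperparameters, and observation features are all hypothetical.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update on the current task (assumed object-level RL)."""
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])

def augment_observation(obs_features, q, state):
    """Concatenate the current state's action-values onto the meta-RL input."""
    return list(obs_features) + list(q[state])

# Hypothetical task with 2 states and 2 actions; Q-values start at zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
q_learning_update(q_table, state=0, action=1, reward=1.0, next_state=1)

meta_input = augment_observation([0.3], q_table, state=0)
print(meta_input)  # observation features followed by Q(s, a) for each action
```

In this sketch the recurrent meta-RL policy would consume `meta_input` in place of the raw observation, so the history is summarized through the action-values rather than through the recurrent state alone.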


Bayesian Heuristics for Robust Spatial Perception

arXiv.org Artificial Intelligence

Spatial perception is a key task in several machine intelligence applications such as robotics and computer vision. In general, it involves the nonlinear estimation of hidden variables that represent the system's state. However, in the presence of measurement outliers, the standard nonlinear least squares formulation results in poor estimates. Several methods have been considered in the literature to improve the reliability of the estimation process. Most methods are based on heuristics, since guaranteed global robust estimation is not generally practical due to high computational costs. Recently, general-purpose robust estimation heuristics have been proposed that leverage existing non-minimal solvers available for the outlier-free formulations without the need for an initial guess. In this work, we propose three Bayesian heuristics that have similar structures. We evaluate these heuristics in practical scenarios to demonstrate their merits in different applications including 3D point cloud registration, mesh registration and pose graph optimization. The general computational advantages our proposals offer make them attractive candidates for spatial perception tasks.


Sensitivity Analysis in the Presence of Intrinsic Stochasticity for Discrete Fracture Network Simulations

arXiv.org Machine Learning

Large-scale discrete fracture network (DFN) simulators are standard fare for studies involving the sub-surface transport of particles since direct observation of real-world underground fracture networks is generally infeasible. While these simulators have seen numerous successes over several engineering applications, estimations of quantities of interest (QoI) - such as breakthrough time of particles reaching the edge of the system - suffer from two distinct types of uncertainty. A run of a DFN simulator requires several parameter values to be set that dictate the placement and size of fractures, the density of fractures, and the overall permeability of the system; uncertainty about the proper parameter choices will lead to some amount of uncertainty in the QoI, called epistemic uncertainty. Furthermore, since DFN simulators rely on stochastic processes to place fractures and govern flow, understanding how this randomness affects the QoI requires several runs of the simulator at distinct random seeds. The uncertainty in the QoI attributed to different realizations (i.e. different seeds) of the same random process leads to a second type of uncertainty, called aleatoric uncertainty. In this paper, we perform a sensitivity analysis, which directly attributes the uncertainty observed in the QoI to the epistemic uncertainty from each input parameter and to the aleatoric uncertainty. We make several design choices to handle an observed heteroskedasticity in DFN simulators, where the aleatoric uncertainty changes for different inputs, since this property makes several standard statistical methods inadmissible. Beyond the specific takeaways on which input variables affect uncertainty the most for DFN simulators, a major contribution of this paper is the introduction of a statistically rigorous workflow for characterizing the uncertainty in DFN flow simulations that exhibit heteroskedasticity.
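The split between epistemic and aleatoric uncertainty can be illustrated with the law of total variance, Var(Q) = Var over inputs of E[Q | input] + E over inputs of Var(Q | input). The sketch below applies a crude Monte Carlo version of that decomposition to a toy heteroskedastic simulator. It is not the paper's workflow and does not model a DFN; the simulator, its single input parameter, and all constants are hypothetical.

```python
import random
import statistics

def toy_simulator(theta, seed):
    """Hypothetical stochastic simulator whose noise scale grows with theta."""
    rng = random.Random(seed)
    return 2.0 * theta + rng.gauss(0.0, 0.1 + theta)

def decompose_variance(thetas, seeds_per_theta):
    """Estimate (epistemic, aleatoric) variance via the law of total variance."""
    cond_means, cond_vars = [], []
    for idx, theta in enumerate(thetas):
        runs = [toy_simulator(theta, seed=idx * seeds_per_theta + s)
                for s in range(seeds_per_theta)]
        cond_means.append(statistics.fmean(runs))   # E[Q | theta] across seeds
        cond_vars.append(statistics.variance(runs)) # Var(Q | theta) across seeds
    epistemic = statistics.pvariance(cond_means)  # variance of conditional means
    aleatoric = statistics.fmean(cond_vars)       # mean of conditional variances
    return epistemic, aleatoric, cond_vars

thetas = [i / 10 for i in range(11)]  # hypothetical parameter grid on [0, 1]
epistemic, aleatoric, cond_vars = decompose_variance(thetas, seeds_per_theta=200)
print(epistemic, aleatoric)
print(cond_vars[0], cond_vars[-1])  # seed-to-seed variance differs across inputs
```

Because the per-input variances in `cond_vars` are far from constant, methods that assume a single noise level for all inputs would be inappropriate for this toy model, which mirrors the heteroskedasticity concern raised in the abstract. The epistemic estimate here also absorbs a small amount of seed noise from the finite number of runs per input.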