 Bayesian Inference


Tensor Basis Gaussian Process Models of Hyperelastic Materials

arXiv.org Machine Learning

In this work, we develop Gaussian process regression (GPR) models of hyperelastic material behavior. First, we consider the direct approach of modeling the components of the Cauchy stress tensor as a function of the components of the Finger stretch tensor in a Gaussian process. We then consider an improvement on this approach that embeds rotational invariance of the stress-stretch constitutive relation in the GPR representation. This approach requires fewer training examples and achieves higher accuracy while maintaining invariance to rotations exactly. Finally, we consider an approach that recovers the strain-energy density function and derives the stress tensor from this potential. Although the error of this model for predicting the stress tensor is higher, the strain-energy density is recovered with high accuracy from limited training data. The approaches presented here are examples of physics-informed machine learning. They go beyond purely data-driven approaches by embedding the physical system constraints directly into the Gaussian process representation of materials models.
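The rotational-invariance idea can be illustrated with a small numerical check. The sketch below is not the paper's model: it assumes the stretch measure is the left Cauchy-Green tensor B = F Fᵀ, writes the stress in a tensor basis {I, B, B²}, and uses a hypothetical `coefficients` function as a stand-in for whatever scalar functions a Gaussian process would learn. Because the coefficients depend only on the invariants of B, rotating the input rotates the output in the same way.

```python
import numpy as np

def invariants(B):
    """Principal invariants (I1, I2, I3) of a symmetric 3x3 tensor."""
    I1 = np.trace(B)
    I2 = 0.5 * (np.trace(B) ** 2 - np.trace(B @ B))
    I3 = np.linalg.det(B)
    return np.array([I1, I2, I3])

def coefficients(inv):
    """Hypothetical stand-in for learned scalar functions of the invariants."""
    I1, I2, I3 = inv
    return np.array([0.1 * np.log(I3), 0.5 + 0.01 * I1, -0.02 * I2])

def stress(B):
    """Stress expanded in the tensor basis {I, B, B^2}."""
    c0, c1, c2 = coefficients(invariants(B))
    return c0 * np.eye(3) + c1 * B + c2 * (B @ B)

rng = np.random.default_rng(0)
F = np.eye(3) + 0.1 * rng.standard_normal((3, 3))  # illustrative deformation gradient
B = F @ F.T                                        # assumed stretch measure
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # random orthogonal matrix

# Transforming the input by Q should transform the output by Q as well.
lhs = stress(Q @ B @ Q.T)
rhs = Q @ stress(B) @ Q.T
equivariance_gap = np.max(np.abs(lhs - rhs))
print(equivariance_gap)
```

The gap is zero up to floating-point round-off for any choice of `coefficients`, which is why only the scalar functions need to be learned from data in a representation of this kind.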


Blang: Bayesian declarative modelling of arbitrary data structures

arXiv.org Machine Learning

Consider a Bayesian inference problem where a variable of interest does not take values in a Euclidean space. These "non-standard" data structures are in reality fairly common. They are frequently used in problems involving latent discrete factor models, networks, and domain specific problems such as sequence alignments and reconstructions, pedigrees, and phylogenies. In principle, Bayesian inference should be particularly well-suited in such scenarios, as the Bayesian paradigm provides a principled way to obtain confidence assessment for random variables of any type. However, much of the recent work on making Bayesian analysis more accessible and computationally efficient has focused on inference in Euclidean spaces. In this paper, we introduce Blang, a domain specific language (DSL) and library aimed at bridging this gap. Blang allows users to perform Bayesian analysis on arbitrary data types while using a declarative syntax similar to BUGS. Blang is augmented with intuitive language additions to invent data types of the user's choosing. To perform inference at scale on such arbitrary state spaces, Blang leverages recent advances in parallelizable, non-reversible Markov chain Monte Carlo methods.


Teaching robots to perceive time -- A reinforcement learning approach (Extended version)

arXiv.org Artificial Intelligence

Time perception is the phenomenological experience of time by an individual. In this paper, we study how to replicate neural mechanisms involved in time perception, allowing robots to take a step towards temporal cognition. Our framework follows a twofold biologically inspired approach. The first step consists of estimating the passage of time from sensor measurements, since environmental stimuli influence the perception of time. Sensor data is modeled as Gaussian processes that represent the second-order statistics of the natural environment. The estimated elapsed time between two events is computed from the maximum likelihood estimate of the joint distribution of the data collected between them. Moreover, exactly how time is encoded in the brain remains unknown, but there is strong evidence of the involvement of dopaminergic neurons in timing mechanisms. Since their phasic activity has a similar behavior to the reward prediction error of temporal-difference learning models, the latter are used to replicate this behavior. The second step of this approach consists therefore of applying the agent's estimate of the elapsed time in a reinforcement learning problem, where a feature representation called Microstimuli is used. We validate our framework by applying it to an experiment that was originally conducted with mice, and conclude that a robot using this framework is able to reproduce the timing mechanisms of the animal's brain.
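The reward prediction error mentioned above can be sketched with a standard TD(0) update on a linear value function. This is an illustration under simplifying assumptions, not the paper's implementation: one-hot time-step features stand in for the Microstimuli representation, and the discount factor, learning rate, and reward schedule are arbitrary choices.

```python
import numpy as np

def td_update(w, x, x_next, reward, gamma=0.98, alpha=0.1):
    """One TD(0) step; delta is the reward prediction error."""
    delta = reward + gamma * (w @ x_next) - (w @ x)
    return w + alpha * delta * x, delta

n_steps = 5
features = np.eye(n_steps + 1)  # row n_steps is the terminal state
w = np.zeros(n_steps + 1)       # the terminal weight is never updated and stays 0

for episode in range(200):
    for t in range(n_steps):
        # A single reward is delivered on the final transition of each episode.
        r = 1.0 if t == n_steps - 1 else 0.0
        w, delta = td_update(w, features[t], features[t + 1], r)

print(w[:n_steps])  # learned values approach gamma ** (steps remaining - 1)
print(delta)        # prediction error on the rewarded transition shrinks with training
```

As training proceeds, the prediction error at the time of reward decays toward zero while the value estimates at earlier time steps grow, which is the qualitative pattern that motivates the comparison with phasic dopaminergic activity.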


Sum-Product Network Decompilation

arXiv.org Artificial Intelligence

There exists a dichotomy between classical probabilistic graphical models, such as Bayesian networks (BNs), and modern tractable models, such as sum-product networks (SPNs). The former have generally intractable inference, but allow a high level of interpretability, while the latter admit a wide range of tractable inference routines, but are typically harder to interpret. Due to this dichotomy, tools to convert between BNs and SPNs are desirable. While one direction -- compiling BNs into SPNs -- is well discussed in Darwiche's seminal work on arithmetic circuit compilation, the converse direction -- decompiling SPNs into BNs -- has received surprisingly little attention. In this paper, we fill this gap by proposing SPN2BN, an algorithm that decompiles an SPN into a BN. SPN2BN has several salient features when compared to the only other two works decompiling SPNs. Most significantly, the BNs returned by SPN2BN are minimal independence-maps. Secondly, SPN2BN is more parsimonious with respect to the introduction of latent variables. Thirdly, the output BN produced by SPN2BN can be precisely characterized with respect to the compiled BN. More specifically, a certain set of directed edges will be added to the input BN, giving what we will call the moral closure. It immediately follows that there is a set of BNs related to the input BN that will also return the same moral closure. Lastly, it is established that our compilation-decompilation process is idempotent. We confirm our results with systematic experiments on a number of synthetic BNs.


Contextual Outlier Detection in Continuous-Time Event Sequences

arXiv.org Machine Learning

Continuous-time event sequences represent discrete events occurring in continuous time. Such sequences arise frequently in real life and cover a wide variety of natural events, such as earthquakes, or events corresponding to human actions, such as medication administrations. Usually we expect the event sequences to follow some regular pattern over time. However, sometimes these regular patterns may be interrupted by unexpected absences or unexpected occurrences of events. Identification of these unexpected cases can be very important as they may point to abnormal situations that need human attention. In this work, we study and develop methods for detecting outliers in continuous-time event sequences, including unexpected absences and unexpected occurrences of events. Since the patterns that event sequences tend to follow may change in different contexts, we develop outlier detection methods based on point processes that take into account different contexts. Our outlier scoring methods are based on Bayesian decision theory and hypothesis testing with theoretical guarantees. To test the performance of the methods, we conduct experiments on both synthetic data and real-world clinical data and show the effectiveness of the proposed methods.


Normalizing flows for deep anomaly detection

arXiv.org Machine Learning

In this work, we consider cases in which certain kinds of anomalies are missing from the training dataset, while significant statistics for the normal class are available. For such scenarios, conventional supervised methods might suffer from the class imbalance, while unsupervised methods tend to ignore difficult anomalous examples. We extend the idea of the supervised classification approach for class-imbalanced datasets by exploiting normalizing flows for proper Bayesian inference of the posterior probabilities.

Index Terms: Machine Learning, Neural Nets, Anomaly Detection, Imbalanced Data Set, Generate Potential Outliers, Normalizing Flow

1 Introduction

The anomaly detection problem is one of the important tasks in the analysis of real-world data. Possible applications range from data-quality certification [1] to finding rare, specific cases of diseases in medicine [2].


A Bayesian Approach to Modelling Longitudinal Data in Electronic Health Records

arXiv.org Machine Learning

Analyzing electronic health records (EHR) poses significant challenges because often few samples are available describing a patient's health and, when available, their information content is highly diverse. The problem we consider is how to integrate sparsely sampled longitudinal data, missing measurements informative of the underlying health status, and fixed demographic information to produce estimated survival distributions that are updated throughout a patient's follow-up. We propose a nonparametric probabilistic model that generates survival trajectories from an ensemble of Bayesian trees that learns variable interactions over time without specifying the longitudinal process beforehand. We show performance improvements on Primary Biliary Cirrhosis patient data.


Interactive Open-Ended Learning for 3D Object Recognition

arXiv.org Artificial Intelligence

The thesis contributes in several important ways to the research area of 3D object category learning and recognition. To cope with the mentioned limitations, we look at human cognition, in particular at the fact that human beings learn to recognize object categories ceaselessly over time. This ability to refine knowledge from the set of accumulated experiences facilitates the adaptation to new environments. Inspired by this capability, we seek to create a cognitive object perception and perceptual learning architecture that can learn 3D object categories in an open-ended fashion. In this context, "open-ended" implies that the set of categories to be learned is not known in advance, and that the training instances are extracted from actual experiences of a robot and thus become available gradually, rather than being available from the beginning of the learning process. In particular, this architecture provides perception capabilities that will allow robots to incrementally learn object categories from the set of accumulated experiences and reason about how to perform complex tasks. This framework integrates detection, tracking, teaching, learning, and recognition of objects. An extensive set of systematic experiments, in multiple experimental settings, was carried out to thoroughly evaluate the described learning approaches. Experimental results show that the proposed system is able to interact with human users, learn new object categories over time, and perform complex tasks. The contributions presented in this thesis have been fully implemented, evaluated on different standard object and scene datasets, and empirically evaluated on different robotic platforms.


Bayesian high-dimensional linear regression with generic spike-and-slab priors

arXiv.org Machine Learning

Spike-and-slab priors are popular Bayesian solutions for high-dimensional linear regression problems. Previous work on the theoretical properties of spike-and-slab methods focuses on specific prior formulations and uses prior-dependent conditions and analyses, and thus cannot be generalized directly. In this paper, we propose a class of generic spike-and-slab priors and develop a unified framework to rigorously assess their theoretical properties. Technically, we provide general conditions under which generic spike-and-slab priors can achieve a nearly-optimal posterior contraction rate and model selection consistency. Our results include those of Castillo et al. (2015) and Narisetty and He (2014) as special cases.
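For readers unfamiliar with the construction, the following sketch draws regression coefficients from one common textbook instance of a spike-and-slab prior: a point mass at zero (the spike) mixed with a zero-mean Gaussian (the slab). It is not the generic class proposed in the paper, and the function name, inclusion probability, and slab scale are illustrative assumptions.

```python
import numpy as np

def sample_spike_and_slab(p, inclusion_prob, slab_scale, rng):
    """Draw p coefficients: exactly zero with prob. 1 - inclusion_prob, else Gaussian."""
    gamma = rng.random(p) < inclusion_prob       # latent inclusion indicators
    slab = rng.normal(0.0, slab_scale, size=p)   # slab draws, used only where included
    beta = np.where(gamma, slab, 0.0)
    return beta, gamma

rng = np.random.default_rng(0)
beta, gamma = sample_spike_and_slab(p=1000, inclusion_prob=0.05, slab_scale=2.0, rng=rng)

print(gamma.sum())              # number of coefficients assigned to the slab
print(np.count_nonzero(beta))   # the same number: excluded coefficients are exactly zero
```

A small inclusion probability encodes the prior belief that most of the coefficients are exactly zero, which is what makes priors of this kind attractive when the number of predictors is large relative to the sample size.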


Continuous Meta-Learning without Tasks

arXiv.org Machine Learning

However, there are several practical considerations in the choice of meta-learning algorithm which can influence the computational efficiency and overall performance of MOCA. For the experiments in this paper, we leverage two meta-learning algorithms which offer a clean Bayesian learning interpretation, relatively low-dimensional posterior statistics, recursive updates for these statistics, and computationally efficient likelihood evaluation under the posterior predictive. For regression experiments, we use ALPaCA (Harrison et al., 2018); for classification experiments, we use a novel algorithm based on similar Bayesian updates which we refer to as PCOC, for probabilistic clustering for online classification. For completeness, we offer a high level overview of these algorithms and show how they fit into the MOCA framework in the following subsections.