 Bayesian Learning


A Probabilistic Disease Progression Model for Predicting Future Clinical Outcome

arXiv.org Machine Learning

In this work, we consider the problem of predicting the course of a progressive disease, such as cancer or Alzheimer's. Progressive diseases often start with mild symptoms that might precede a diagnosis, and each patient follows their own trajectory. Patient trajectories exhibit wide variability, which can be associated with many factors such as genotype, age, or sex. An additional layer of complexity is that, in real life, the amount and type of data available for each patient can differ significantly. For example, for one patient we might have no prior history, whereas for another patient we might have detailed clinical assessments obtained at multiple prior time-points. This paper presents a probabilistic model that can handle multiple modalities (including images and clinical assessments) and variable patient histories with irregular timings and missing entries, to predict clinical scores at future time-points. We use a sigmoidal function to model latent disease progression, which gives rise to clinical observations in our generative model. We implemented an approximate Bayesian inference strategy on the proposed model to estimate the parameters on data from a large population of subjects. Furthermore, the Bayesian framework enables the model to automatically fine-tune its predictions based on historical observations that might be available on the test subject. We applied our method to a longitudinal Alzheimer's disease dataset with more than 3000 subjects [23] and present a detailed empirical analysis of prediction performance under different scenarios, with comparisons against several benchmarks. We also demonstrate how the proposed model can be interrogated to glean insights about temporal dynamics in Alzheimer's disease.
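
The sigmoidal latent progression mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the `onset` and `rate` parameters, and the linear mapping from the latent value to a clinical score range are hypothetical and are not taken from the paper.

```python
import math

def latent_progression(t, onset, rate):
    """Hypothetical sigmoid latent disease state in (0, 1) at time t.

    `onset` is the time at which the sigmoid reaches 0.5 and `rate`
    controls how quickly it rises; both are illustrative parameters.
    """
    return 1.0 / (1.0 + math.exp(-rate * (t - onset)))

def expected_score(t, onset, rate, lo, hi):
    """Map the latent state linearly onto an assumed score range [lo, hi]."""
    return lo + (hi - lo) * latent_progression(t, onset, rate)

# Expected score for one hypothetical subject at three ages.
trajectory = [expected_score(age, onset=70.0, rate=0.5, lo=0.0, hi=30.0)
              for age in (60.0, 70.0, 80.0)]
```

In a generative model of this kind, the observed clinical scores would be noisy measurements around such a curve, with subject-specific parameters inferred from whatever history is available.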


SAM: Structural Agnostic Model, Causal Discovery and Penalized Adversarial Learning

arXiv.org Machine Learning

We present the Structural Agnostic Model (SAM), a framework to estimate end-to-end non-acyclic causal graphs from observational data. In a nutshell, SAM implements an adversarial game in which a separate model generates each variable, given real values from all others. In tandem, a discriminator attempts to distinguish between the joint distributions of real and generated samples. Finally, a sparsity penalty forces each generator to consider only a small subset of the variables, yielding a sparse causal graph. SAM scales easily to hundreds of variables. Our experiments show the state-of-the-art performance of SAM on discovering causal structures and modeling interventions, in both acyclic and non-acyclic graphs.


Deep k-Nearest Neighbors: Towards Confident, Interpretable and Robust Deep Learning

arXiv.org Machine Learning

Deep neural networks (DNNs) enable innovative applications of machine learning like image recognition, machine translation, or malware detection. However, deep learning is often criticized for its lack of robustness in adversarial settings (e.g., vulnerability to adversarial inputs) and general inability to rationalize its predictions. In this work, we exploit the structure of deep learning to enable new learning-based inference and decision strategies that achieve desirable properties such as robustness and interpretability. We take a first step in this direction and introduce the Deep k-Nearest Neighbors (DkNN). This hybrid classifier combines the k-nearest neighbors algorithm with representations of the data learned by each layer of the DNN: a test input is compared to its neighboring training points according to the distance that separates them in the representations. We show the labels of these neighboring points afford confidence estimates for inputs outside the model's training manifold, including malicious inputs like adversarial examples, and thereby provide protection against inputs that are outside the model's understanding. This is because the nearest neighbors can be used to estimate the nonconformity of, i.e., the lack of support for, a prediction in the training data. The neighbors also constitute human-interpretable explanations of predictions. We evaluate the DkNN algorithm on several datasets, and show the confidence estimates accurately identify inputs outside the model, and that the explanations provided by nearest neighbors are intuitive and useful in understanding model failures.
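
The idea of measuring nonconformity from nearest neighbors in each layer's representation can be sketched as follows. This is one plausible reading of the abstract, not the authors' implementation: the function names, the Euclidean distance, and the toy two-layer representations are assumptions, and the score simply counts, across layers, the neighbors whose label disagrees with a candidate label.

```python
import math

def k_nearest_labels(train_reps, train_labels, query, k):
    """Labels of the k training points closest to `query` (Euclidean distance)."""
    order = sorted(range(len(train_reps)),
                   key=lambda i: math.dist(train_reps[i], query))
    return [train_labels[i] for i in order[:k]]

def nonconformity(layers_train, train_labels, layers_query, candidate, k):
    """Count neighbors, summed over layers, whose label differs from `candidate`."""
    total = 0
    for train_reps, query in zip(layers_train, layers_query):
        neighbors = k_nearest_labels(train_reps, train_labels, query, k)
        total += sum(1 for label in neighbors if label != candidate)
    return total

# Toy data: four training points, their representations at two hypothetical layers.
labels = [0, 0, 1, 1]
layers_train = [
    [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]],  # layer 1 representations
    [[0.0], [0.1], [1.0], [1.1]],                      # layer 2 representations
]
layers_query = [[0.0, 0.5], [0.05]]                    # the test input at each layer

score_for_0 = nonconformity(layers_train, labels, layers_query, candidate=0, k=2)
score_for_1 = nonconformity(layers_train, labels, layers_query, candidate=1, k=2)
```

A low score means the training data supports the candidate label at every layer; a high score for all candidate labels would suggest the input lies far from anything the model was trained on.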


Simulation and Calibration of a Fully Bayesian Marked Multidimensional Hawkes Process with Dissimilar Decays

arXiv.org Machine Learning

We propose a simulation method for multidimensional Hawkes processes based on superposition theory of point processes. This formulation allows us to design efficient simulations for Hawkes processes with differing exponentially decaying intensities. We demonstrate that inter-arrival times can be decomposed into simpler auxiliary variables that can be sampled directly, giving exact simulation with no approximation. We establish that the auxiliary variables provide information on the parent process for each event time. The algorithm's correctness is shown by verifying the simulated intensities against their theoretical moments. A modular inference procedure consisting of Gibbs samplers through the auxiliary variable augmentation and adaptive rejection sampling is presented. Finally, we compare our proposed simulation method against existing methods, and find significant improvement in terms of algorithm speed. Our inference algorithm is used to discover the strengths of mutual excitations in real dark networks. Keywords: Hawkes process, marked point process, exact simulation, Bayesian inference.
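
To make the decomposition of inter-arrival times concrete, here is a minimal sketch for the much simpler univariate, unmarked case with a single exponential decay. It is not the authors' multidimensional algorithm. It assumes an intensity of the form mu + sum of alpha * exp(-beta * (t - t_k)) over past events, and treats the process as a superposition of a constant-rate baseline and a decaying excitation component, each with a waiting time that can be sampled directly by inversion. The function name and parameters are hypothetical.

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Exact simulation of a univariate exponential-kernel Hawkes process.

    Assumed intensity: mu + sum_k alpha * exp(-beta * (t - t_k)).
    Returns the sorted list of event times in [0, horizon].
    """
    rng = random.Random(seed)
    t, excitation, events = 0.0, 0.0, []
    while True:
        # Auxiliary variable 1: waiting time of the constant baseline (rate mu).
        wait_base = -math.log(1.0 - rng.random()) / mu
        # Auxiliary variable 2: waiting time of the decaying excitation.
        # It may never fire, in which case the waiting time is infinite.
        wait_excited = math.inf
        if excitation > 0.0:
            d = 1.0 + beta * math.log(1.0 - rng.random()) / excitation
            if d > 0.0:
                wait_excited = -math.log(d) / beta
        wait = min(wait_base, wait_excited)
        t += wait
        if t > horizon:
            return events
        # Decay the excitation over the waiting time, then add the new jump.
        excitation = excitation * math.exp(-beta * wait) + alpha
        events.append(t)

events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, horizon=50.0, seed=1)
```

Which of the two auxiliary waiting times is smaller also indicates whether an event was triggered by the baseline or by past events, which is the kind of parent information the abstract refers to.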


Large Scale Automated Forecasting for Monitoring Network Safety and Security

arXiv.org Machine Learning

As outlined in [1], leveraging big data and real-time analytics constitute two main research avenues for OR/MS in the analytics age. Information and communication technologies (ICT) have experienced exponential growth in the last few decades, and most human activities, businesses and devices strongly depend on ICT [2]. With the advent of the Internet of Things (IoT), this interrelation will become even more evident and dramatically change the way in which different components of business and service systems interact. In parallel, risks concerning the security of ICT systems are also growing, as pointed out, e.g., in [3].


Estimating activity cycles with probabilistic methods I. Bayesian Generalised Lomb-Scargle Periodogram with Trend

arXiv.org Machine Learning

Period estimation is one of the central topics in astronomical time series analysis, where data is often unevenly sampled. Especially challenging are studies of stellar magnetic cycles, as there the periods sought are of the same order of length as the datasets themselves. The datasets often contain trends, the origin of which is either a real long-term cycle or an instrumental effect, but these effects cannot be reliably separated, and they can lead to erroneous period determinations if not properly handled. In this study we aim at developing a method that can handle the trends properly, and by performing an extensive set of tests, we show that this is the optimal procedure when contrasted with methods that do not include the trend directly in the model. The effect of the form of the noise (whether constant or heteroscedastic) on the results is also investigated. We introduce a Bayesian Generalised Lomb-Scargle Periodogram with Trend (BGLST), which is a probabilistic linear regression model using Gaussian priors for the coefficients and a uniform prior for the frequency parameter. We show, using synthetic data, that when there is no prior information on whether and to what extent the true model of the data contains a linear trend, the introduced BGLST method is preferable to the methods which either detrend the data or leave the data untrended before fitting the periodic model. Whether to use noise with non-constant variance in the model depends on the density of the data sampling as well as on the true noise type of the process.


Deep Bayesian Neural Networks. – Stefano Cosentino – Medium

#artificialintelligence

Conventional neural networks aren't well designed to model the uncertainty associated with the predictions they make. For that, one way is to go full Bayesian. What are we trying to do? Any deep network has parameters, often in the form of weights (w_1, w_2, …) and biases (b_1, b_2, …). The conventional (non-Bayesian) way is to learn only the optimal values via maximum likelihood estimation.
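
The contrast between a single learned value and a distribution over parameters can be shown with a minimal sketch. It assumes a one-weight, one-bias linear unit and independent Gaussian distributions over both parameters; these choices, the function names, and the numbers are all hypothetical and are not taken from the article.

```python
import random
import statistics

def predict(x, w, b):
    """A single linear unit: the simplest possible 'network'."""
    return w * x + b

def predictive_samples(x, w_mean, w_std, b_mean, b_std, n, seed=0):
    """Draw n predictions by sampling the weight and bias from assumed Gaussians."""
    rng = random.Random(seed)
    return [predict(x, rng.gauss(w_mean, w_std), rng.gauss(b_mean, b_std))
            for _ in range(n)]

# Point estimate: one weight, one bias, one prediction, no uncertainty.
point = predict(2.0, w=1.5, b=0.1)

# Bayesian view: many plausible parameter values, hence a spread of predictions.
samples = predictive_samples(2.0, w_mean=1.5, w_std=0.2,
                             b_mean=0.1, b_std=0.05, n=1000)
spread = statistics.pstdev(samples)
```

The spread of the sampled predictions is what a point estimate cannot provide: it grows when the parameters are poorly determined and collapses to zero when the distributions shrink to a single value.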


Leveraging Crowdsourcing Data For Deep Active Learning - An Application: Learning Intents in Alexa

arXiv.org Machine Learning

This paper presents a generic Bayesian framework that enables any deep learning model to actively learn from targeted crowds. Our framework inherits from recent advances in Bayesian deep learning, and extends existing work by considering the targeted crowdsourcing approach, where multiple annotators with unknown expertise contribute an uncontrolled amount (often limited) of annotations. Our framework leverages the low-rank structure in annotations to learn individual annotator expertise, which then helps to infer the true labels from noisy and sparse annotations. It provides a unified Bayesian model to simultaneously infer the true labels and train the deep learning model in order to reach an optimal learning efficacy. Finally, our framework exploits the uncertainty of the deep learning model during prediction as well as the annotators' estimated expertise to minimize the number of required annotations and annotators for optimally training the deep learning model. We evaluate the effectiveness of our framework for intent classification in Alexa (Amazon's personal assistant), using both synthetic and real-world datasets. Experiments show that our framework can accurately learn annotator expertise, infer true labels, and effectively reduce the amount of annotations in model training as compared to state-of-the-art approaches. We further discuss the potential of our proposed framework in bridging machine learning and crowdsourcing towards improved human-in-the-loop systems.


Exact and approximate inference in graphical models: variable elimination and beyond

arXiv.org Artificial Intelligence

Probabilistic graphical models offer a powerful framework to account for the dependence structure between variables, which is represented as a graph. However, the dependence between variables may render inference tasks intractable. In this paper we review techniques exploiting the graph structure for exact inference, borrowed from optimisation and computer science. They are built on the principle of variable elimination whose complexity is dictated in an intricate way by the order in which variables are eliminated. The so-called treewidth of the graph characterises this algorithmic complexity: low-treewidth graphs can be processed efficiently. The first message that we illustrate is therefore that, for inference in graphical models, the number of variables is not the limiting factor, and it is worth checking the treewidth before turning to approximate methods. We show how algorithms providing an upper bound of the treewidth can be exploited to derive a 'good' elimination order that enables exact inference. The second message is that when the treewidth is too large, algorithms for approximate inference linked to the principle of variable elimination, such as loopy belief propagation and variational approaches, can lead to accurate results while being much less time-consuming than Monte-Carlo approaches. We illustrate the techniques reviewed in this article on benchmarks of inference problems in genetic linkage analysis and computer vision, as well as on hidden variables restoration in coupled Hidden Markov Models.
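
As a small illustration of variable elimination, the sketch below computes a marginal on a three-variable chain A -> B -> C by summing out A and then B, and checks the result against brute-force enumeration of the joint distribution. The chain, the probability tables, and the function names are made up for this example and do not come from the paper.

```python
def eliminate(message, conditional):
    """Sum out a parent variable.

    `message[p]` is the current weight of parent state p and
    `conditional[p][c]` is P(child = c | parent = p).
    Returns the resulting weight of each child state.
    """
    n_child = len(conditional[0])
    return [sum(message[p] * conditional[p][c] for p in range(len(message)))
            for c in range(n_child)]

# Hypothetical binary chain A -> B -> C.
p_a = [0.6, 0.4]
p_b_given_a = [[0.7, 0.3], [0.2, 0.8]]
p_c_given_b = [[0.9, 0.1], [0.5, 0.5]]

# Elimination order A, then B: each step only touches one small table.
m_b = eliminate(p_a, p_b_given_a)   # marginal of B
p_c = eliminate(m_b, p_c_given_b)   # marginal of C

def brute_force_p_c():
    """Same marginal by enumerating all joint states, for comparison."""
    result = [0.0, 0.0]
    for a in range(2):
        for b in range(2):
            for c in range(2):
                result[c] += p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    return result
```

On a chain, every elimination step involves a table over at most two variables, which reflects a treewidth of 1; the enumeration, by contrast, grows exponentially with the number of variables.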


Artificial Intelligence First - Disruption Hub

#artificialintelligence

Although materially beneficial corporate deployments of AI are beginning to proliferate, the AI activities of the majority still amount to a few isolated pilot projects conceived on an ad hoc basis. Organisations without a clear AI strategy – and that's most – run the risk of falling behind as other better organised industry players move forward. That said, while individual AI solutions can be transformative within the scope of their application, that's not as clear-cut an argument for front-to-back change as, say, the digital transformation of a high street retailer. Developing an AI strategy requires an exercise of careful discrimination – acknowledging the present limitations of AI as well as its strengths in order to identify where one can, cannot, or even should not exploit it. This article is about the 'what' of an AI strategy rather than the equally important 'how'.