AITopics | Learning Graphical Models

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

Deep Kalman Filters

Krishnan, Rahul G., Shalit, Uri, Sontag, David

arXiv.org Machine Learning, Nov-25-2015

Kalman Filters are one of the most influential models of time-varying phenomena. They admit an intuitive probabilistic interpretation, have a simple functional form, and enjoy widespread adoption in a variety of disciplines. Motivated by recent variational methods for learning deep generative models, we introduce a unified algorithm to efficiently learn a broad spectrum of Kalman filters. Of particular interest is the use of temporal generative models for counterfactual inference. We investigate the efficacy of such models for counterfactual inference, and to that end we introduce the "Healing MNIST" dataset where long-term structure, noise and actions are applied to sequences of digits. We show the efficacy of our method for modeling this dataset. We further show how our model can be used for counterfactual inference for patients, based on electronic health record data of 8,000 patients over 4.5 years.
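
For orientation, the classical linear-Gaussian Kalman filter that the abstract builds on can be sketched in a few lines. This is a minimal scalar version, not the deep variant proposed in the paper; the function name `kalman_filter_1d` and the parameters `a`, `q`, `h`, `r`, `mean0`, `var0` are illustrative assumptions, not taken from the source.

```python
def kalman_filter_1d(observations, a=1.0, q=0.01, h=1.0, r=1.0,
                     mean0=0.0, var0=1.0):
    """Scalar Kalman filter (hypothetical helper, for illustration only).

    Assumed model:
        z_t = a * z_{t-1} + noise with variance q   (latent dynamics)
        x_t = h * z_t     + noise with variance r   (observation)
    Returns a list of (posterior mean, posterior variance) per time step.
    """
    mean, var = mean0, var0
    filtered = []
    for x in observations:
        # Predict: push the previous posterior through the dynamics.
        mean_pred = a * mean
        var_pred = a * a * var + q
        # Update: correct the prediction with the new observation.
        gain = var_pred * h / (h * h * var_pred + r)
        mean = mean_pred + gain * (x - h * mean_pred)
        var = (1.0 - gain * h) * var_pred
        filtered.append((mean, var))
    return filtered
```

Calling `kalman_filter_1d([1.0, 1.0], q=0.0)` shows the posterior variance shrinking as observations accumulate, which is the "intuitive probabilistic interpretation" the abstract alludes to.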

artificial intelligence, machine learning, sequence, (17 more...)

arXiv.org Machine Learning

1511.05121

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.72)
Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Causal inference using invariant prediction: identification and confidence intervals

Peters, Jonas, Bühlmann, Peter, Meinshausen, Nicolai

arXiv.org Artificial Intelligence, Nov-24-2015

What is the difference between a prediction made with a causal model and one made with a non-causal model? Suppose we intervene on the predictor variables or change the whole environment. The predictions from a causal model will in general work as well under interventions as for observational data. In contrast, predictions from a non-causal model can potentially be very wrong if we actively intervene on variables. Here, we propose to exploit this invariance of a prediction under a causal model for causal inference: given different experimental settings (for example various interventions) we collect all models that do show invariance in their predictive accuracy across settings and interventions. The causal model will be a member of this set of models with high probability. This approach yields valid confidence intervals for the causal relationships in quite general scenarios. We examine the example of structural equation models in more detail and provide sufficient assumptions under which the set of causal predictors becomes identifiable. We further investigate robustness properties of our approach under model misspecification and discuss possible extensions. The empirical properties are studied for various data sets, including large-scale gene perturbation experiments.

assumption, causal predictor, intervention, (15 more...)

arXiv.org Artificial Intelligence

1501.01332

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Maximum Likelihood Estimation for Single Linkage Hierarchical Clustering

Zhu, Dekang, Guralnik, Dan P., Wang, Xuezhi, Li, Xiang, Moran, Bill

arXiv.org Machine Learning, Nov-24-2015

Clustering is a common technique for statistical data analysis, widely used in data mining, machine learning, pattern recognition, image analysis, bioinformatics and cyber security. Conventional ("flat", "hard") clustering methods accept a finite metric space (O, d) as input and return a partition of O as their output. Hierarchical clustering (HC) methods have a different philosophy: their output is an entire hierarchy of partitions, called a dendrogram, capable of exhibiting multi-scale structure in the data set [1, 2]. Rather than fixing the required number of clusters in advance, as is common for many flat clustering algorithms, it is more informative to furnish a hierarchy of clusters, providing an opportunity to choose a partition at a scale most natural for the context of the task at hand. Many HC methods require linkage functions to provide a measure of dissimilarity between clusters (see [3, 4] for a fairly recent review). Some commonly used linkage functions are single linkage, complete linkage and average linkage. The single linkage HC (SLHC) method, though suffering from the so-called "chaining effect", remains popular for large-scale applications [5] because of the low complexity of implementing it using minimum spanning trees (MST) [6].
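
The link between single linkage and minimum spanning trees mentioned above can be illustrated with a short sketch: running Kruskal's algorithm over all pairwise distances yields the single-linkage merges in order of increasing height. This is a generic textbook construction, not the maximum likelihood estimator from the paper; `single_linkage_merges` and its arguments are hypothetical names chosen for the example.

```python
from itertools import combinations


def single_linkage_merges(points, dist):
    """Single-linkage merges via Kruskal's MST algorithm (illustrative).

    points: list of items; dist: function returning their dissimilarity.
    Returns a list of (height, i, j) tuples, one per merge, where i and j
    are the indices of the two points whose edge joined two clusters.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Follow parent links to the cluster representative,
        # halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    merges = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # edge connects two different clusters: an MST edge
            parent[ri] = rj
            merges.append((d, i, j))
    return merges
```

For example, `single_linkage_merges([0.0, 1.0, 3.0, 7.0], lambda a, b: abs(a - b))` produces three merges, one for each MST edge. Sorting all pairs costs O(n^2 log n), so this sketch favors clarity over the efficiency that makes MST-based implementations attractive at scale.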

artificial intelligence, dendrogram, machine learning, (15 more...)

arXiv.org Machine Learning

1511.07944

Country: North America > United States > Pennsylvania (0.28)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Natural Language Understanding with Distributed Representation

Cho, Kyunghyun

arXiv.org Machine Learning, Nov-24-2015

As the name of the course suggests, this lecture note introduces readers to a neural network based approach to natural language understanding/processing. In order to make it as self-contained as possible, I spend much time describing the basics of machine learning and neural networks, and only then introduce how they are used for natural languages. On the language front, I focus almost solely on language modelling and machine translation, the two topics I personally find most fascinating and most fundamental to natural language understanding. After about a month of lectures and about 40 pages of writing this lecture note, I found this fascinating note [47] by Yoav Goldberg on neural network models for natural language processing. That note covers a wider range of topics on natural language processing with distributed representations in more detail, and I highly recommend reading it (hopefully along with this lecture note).

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

1511.07916

Country:

Europe (1.00)
North America > United States > Texas (0.27)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education (1.00)
Government > Military (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Private Posterior distributions from Variational approximations

Karwa, Vishesh, Kifer, Dan, Slavković, Aleksandra B.

arXiv.org Machine Learning, Nov-24-2015

Privacy-preserving mechanisms such as differential privacy inject additional randomness in the form of noise in the data, beyond the sampling mechanism. Ignoring this additional noise can lead to inaccurate and invalid inferences. In this paper, we incorporate the privacy mechanism explicitly into the likelihood function by treating the original data as missing, with an end goal of estimating posterior distributions over model parameters. This leads to a principled way of performing valid statistical inference using private data; however, the corresponding likelihoods are intractable. We derive fast and accurate variational approximations to tackle such intractable likelihoods that arise due to privacy. We focus on estimating posterior distributions of parameters of the naive Bayes log-linear model, where the sufficient statistics of this model are shared using a differentially private interface. Using a simulation study, we show that the posterior approximations outperform the naive method of ignoring the noise addition mechanism.

approximation, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1511.07896

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.89)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)

Searching for Objects using Structure in Indoor Scenes

Nagaraja, Varun K., Morariu, Vlad I., Davis, Larry S.

arXiv.org Artificial Intelligence, Nov-24-2015

To identify the location of objects of a particular class, a passive computer vision system generally processes all the regions in an image to finally output a few regions. However, we can use structure in the scene to search for objects without processing the entire image. We propose a search technique that sequentially processes image regions such that the regions that are more likely to correspond to the query class object are explored earlier. We frame the problem as a Markov decision process and use an imitation learning algorithm to learn a search strategy. Since structure in the scene is essential for search, we work with indoor scene images as they contain both unary scene context information and object-object context in the scene. We perform experiments on the NYU-depth v2 dataset and show that the unary scene context features alone can achieve a significantly high average precision while processing only 20-25% of the regions for classes like bed and sofa. By considering object-object context along with the scene context features, the performance is further improved for classes like counter, lamp, pillow and sofa.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.5244/C.29.53

1511.0771

Country: North America > United States > Maryland (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)

Stick-Breaking Policy Learning in Dec-POMDPs

Liu, Miao, Amato, Christopher, Liao, Xuejun, Carin, Lawrence, How, Jonathan P.

arXiv.org Artificial Intelligence, Nov-23-2015

Expectation maximization (EM) has recently been shown to be an efficient algorithm for learning finite-state controllers (FSCs) in large decentralized POMDPs (Dec-POMDPs). However, current methods use fixed-size FSCs and often converge to maxima that are far from optimal. This paper considers a variable-size FSC to represent the local policy of each agent. These variable-size FSCs are constructed using a stick-breaking prior, leading to a new framework called decentralized stick-breaking policy representation (Dec-SBPR). This approach learns the controller parameters with a variational Bayesian algorithm without having to assume that the Dec-POMDP model is available. The performance of Dec-SBPR is demonstrated on several benchmark problems, showing that the algorithm scales to large problems while outperforming other state-of-the-art methods.
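
To give a feel for what a stick-breaking prior does, the sketch below draws weights from the generic stick-breaking construction: each step breaks off a Beta(1, alpha) fraction of whatever stick remains. The abstract does not spell out the exact prior used in Dec-SBPR, so this is the standard textbook construction under that assumption; `stick_breaking_weights`, `alpha` and `num_sticks` are hypothetical names.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng=None):
    """Truncated stick-breaking draw (illustrative, not the paper's code).

    alpha: concentration parameter; larger values spread mass over
           more components.
    Returns (weights, remaining), where weights has num_sticks entries
    and remaining is the stick mass not yet allocated.
    """
    rng = rng or random.Random()
    remaining = 1.0
    weights = []
    for _ in range(num_sticks):
        # Break off a Beta(1, alpha) fraction of the remaining stick.
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining
```

Because the weights decay stochastically, only a handful of components typically carry most of the mass, which is what lets a model of this kind behave as if its size were variable rather than fixed in advance.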

artificial intelligence, dec-sbpr, upstream oil & gas, (16 more...)

arXiv.org Artificial Intelligence

1505.00274

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Germany (0.14)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas > Upstream (0.47)
Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Functional Gaussian Process Model for Bayesian Nonparametric Analysis

Duan, Leo L., Wang, Xia, Szczesniak, Rhonda D.

arXiv.org Machine Learning, Nov-23-2015

The Gaussian process is a theoretically appealing model for nonparametric analysis, but its computational cost hinders its use at large scale, and the existing reduced-rank solutions are usually heuristic. In this work, we propose a novel construction of the Gaussian process as a projection from fixed discrete frequencies to any continuous location. This leads to a valid stochastic process that has theoretical support with reduced rank in the spectral density, as well as a high-speed computing algorithm. Our method provides accurate estimates for the covariance parameters and a concise form of the predictive distribution for spatial prediction. For non-stationary data, we adopt the mixture framework with a customized spectral dependency structure. This enables clustering based on local stationarity while maintaining joint Gaussianity. Our work is directly applicable to some of the challenges in spatial data, such as large-scale computation, anisotropic covariance, and spatio-temporal modeling. We illustrate the uses of the model via simulations and an application to a massive dataset.

artificial intelligence, machine learning, modeling & simulation, (20 more...)

arXiv.org Machine Learning

1502.03042

Country: North America (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Black box variational inference for state space models

Archer, Evan, Park, Il Memming, Buesing, Lars, Cunningham, John, Paninski, Liam

arXiv.org Machine Learning, Nov-23-2015

Latent variable time-series models are among the most heavily used tools from machine learning and applied statistics. These models have the advantage of learning latent structure both from noisy observations and from the temporal ordering in the data, where it is assumed that meaningful correlation structure exists across time. A few highly-structured models, such as the linear dynamical system with linear-Gaussian observations, have closed-form inference procedures (e.g. the Kalman Filter), but this case is an exception to the general rule that exact posterior inference in more complex generative models is intractable. Consequently, much work in time-series modeling focuses on approximate inference procedures for one particular class of models. Here, we extend recent developments in stochastic variational inference to develop a 'black-box' approximate inference technique for latent variable models with latent dynamical structure. We propose a structured Gaussian variational approximate posterior that carries the same intuition as the standard Kalman filter-smoother but, importantly, permits us to use the same inference approach to approximate the posterior of much more general, nonlinear latent variable generative models. We show that our approach recovers accurate estimates in the case of basic models with closed-form posteriors, and more interestingly performs well in comparison to variational approaches that were designed in a bespoke fashion for specific non-conjugate models.

artificial intelligence, machine learning, posterior, (18 more...)

arXiv.org Machine Learning

1511.07367

Country: North America > United States > New York (0.32)

Genre: Research Report (0.40)

Industry: Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Bayesian Evidence and Model Selection

Knuth, Kevin H., Habeck, Michael, Malakar, Nabin K., Mubeen, Asim M., Placek, Ben

arXiv.org Machine Learning, Nov-23-2015

In this paper we review the concepts of Bayesian evidence and Bayes factors, also known as log odds ratios, and their application to model selection. The theory is presented along with a discussion of analytic, approximate and numerical techniques. Specific attention is paid to the Laplace approximation, variational Bayes, importance sampling, thermodynamic integration, and nested sampling and its recent variants. Analogies to statistical physics, from which many of these techniques originate, are discussed in order to provide readers with deeper insights that may lead to new techniques. The utility of Bayesian model testing in the domain sciences is demonstrated by presenting four specific practical examples considered within the context of signal processing in the areas of signal detection, sensor characterization, scientific model selection and molecular force characterization.
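
As a small worked case of Bayesian evidence and Bayes factors, the sketch below compares two models for a sequence of coin flips where both evidences are available in closed form: a fair coin (p fixed at 0.5) against a coin with unknown bias under a Beta(a, b) prior. The example and all function names are illustrative assumptions and are not drawn from the paper's four applications.

```python
from math import lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_evidence_fair(heads, tails):
    """Log evidence of a specific flip sequence under p = 0.5."""
    return (heads + tails) * log(0.5)


def log_evidence_beta(heads, tails, a=1.0, b=1.0):
    """Log evidence with the bias integrated out under a Beta(a, b) prior.

    The integral of p^heads * (1-p)^tails against the prior equals
    B(a + heads, b + tails) / B(a, b).
    """
    return log_beta(a + heads, b + tails) - log_beta(a, b)


def log_bayes_factor(heads, tails, a=1.0, b=1.0):
    """Log Bayes factor of the unknown-bias model over the fair coin."""
    return log_evidence_beta(heads, tails, a, b) - log_evidence_fair(heads, tails)
```

With one head and one tail and a uniform prior, the Bayes factor is (1/6) / (1/4) = 2/3, mildly favoring the simpler fair-coin model; with ten heads and no tails it is 1024/11, strongly favoring the unknown-bias model. When no closed form exists, the evidence integral has to be estimated with techniques such as those the abstract lists.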

artificial intelligence, likelihood, machine learning, (20 more...)

arXiv.org Machine Learning

doi: 10.1016/j.dsp.2015.06.012

1411.3013

Country:

Europe > Germany (0.68)
North America > United States > New York (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > Canada > Ontario (0.28)

Genre:

Research Report (0.63)
Overview (0.48)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
