AITopics

1910.11703

Country:

North America > Canada > British Columbia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Overview (0.87)
Research Report (0.64)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Garner, Philip N., Tong, Sibo

A Bayesian Approach to Recurrence in Neural Networks

arXiv.org Machine LearningOct-24-2019

We begin by reiterating that common neural network activation functions have simple Bayesian origins. In this spirit, we go on to show that Bayes's theorem also implies a simple recurrence relation; this leads to a Bayesian recurrent unit with a prescribed feedback formulation. We show that introduction of a context indicator leads to a variable feedback that is similar to the forget mechanism in conventional recurrent units. A similar approach leads to a probabilistic input gate. The Bayesian formulation leads naturally to the two pass algorithm of the Kalman smoother or forward-backward algorithm, meaning that inference naturally depends upon future inputs as well as past ones. Experiments on speech recognition confirm that the resulting architecture can perform as well as a bidirectional recurrent network with the same number of parameters as a unidirectional one. Further, when configured explicitly bidirectionally, the architecture can exceed the performance of a conventional bidirectional recurrence.

architecture, recurrence, recursion, (15 more...)

1910.11247

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Hawaii (0.04)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
(8 more...)

Genre: Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.82)

Teymur, Onur, Filippi, Sarah

A Bayesian nonparametric test for conditional independence

arXiv.org Machine LearningOct-24-2019

This article introduces a Bayesian nonparametric method for quantifying the relative evidence in a dataset in favour of the dependence or independence of two variables conditional on a third. The approach uses P olya tree priors on spaces of conditional probability densities, accounting for uncertainty in the form of the underlying distributions in a nonparametric way. The Bayesian perspective provides an inherently symmetric probability measure of conditional dependence or independence, a feature particularly advantageous in causal discovery and not employed by any previous procedure of this type.

conditional independence, independence, olya tree, (12 more...)

1910.11219

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Nguyen, Viet Anh, Shafieezadeh-Abadeh, Soroosh, Yue, Man-Chung, Kuhn, Daniel, Wiesemann, Wolfram

Optimistic Distributionally Robust Optimization for Nonparametric Likelihood Approximation

The likelihood function is a fundamental component in Bayesian statistics. However, evaluating the likelihood of an observation is computationally intractable in many applications. In this paper, we propose a non-parametric approximation of the likelihood that identifies a probability measure which lies in the neighborhood of the nominal measure and that maximizes the probability of observing the given sample point. We show that when the neighborhood is constructed by the Kullback-Leibler divergence, by moment conditions or by the Wasserstein distance, then our \textit{optimistic likelihood} can be determined through the solution of a convex optimization problem, and it admits an analytical expression in particular cases. We also show that the posterior inference problem with our optimistic likelihood approximation enjoys strong theoretical performance guarantees, and it performs competitively in a probabilistic classification task.

ambiguity, artificial intelligence, machine learning, (17 more...)

1910.10583

Country:

North America > United States (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre: Research Report (0.63)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Lee, Sangkyun, Sobczyk, Piotr, Bogdan, Malgorzata

Structure Learning of Gaussian Markov Random Fields with False Discovery Rate Control

In this paper, we propose a new estimation procedure for discovering the structure of Gaussian Markov random fields (MRFs) with false discovery rate (FDR) control, making use of the sorted l1-norm (SL1) regularization. A Gaussian MRF is an acyclic graph representing a multivariate Gaussian distribution, where nodes are random variables and edges represent the conditional dependence between the connected nodes. Since it is possible to learn the edge structure of Gaussian MRFs directly from data, Gaussian MRFs provide an excellent way to understand complex data by revealing the dependence structure among many inputs features, such as genes, sensors, users, documents, etc. In learning the graphical structure of Gaussian MRFs, it is desired to discover the actual edges of the underlying but unknown probabilistic graphical model-it becomes more complicated when the number of random variables (features) p increases, compared to the number of data points n. In particular, when p >> n, it is statistically unavoidable for any estimation procedure to include false edges. Therefore, there have been many trials to reduce the false detection of edges, in particular, using different types of regularization on the learning parameters. Our method makes use of the SL1 regularization, introduced recently for model selection in linear regression. We focus on the benefit of SL1 regularization that it can be used to control the FDR of detecting important random variables. Adapting SL1 for probabilistic graphical models, we show that SL1 can be used for the structure learning of Gaussian MRFs using our suggested procedure nsSLOPE (neighborhood selection Sorted L-One Penalized Estimation), controlling the FDR of detecting edges.

estimation, matrix, symmetry 2019, (14 more...)

doi: 10.3390/symxx010005

1910.1086

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
Europe > Spain > Andalusia > Cádiz Province > Cadiz (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Variational Predictive Information Bottleneck

Alemi, Alexander A.

In classic papers, Zellner demonstrated that Bayesian inference could be derived as the solution to an information theoretic functional. Below we derive a generalized form of this functional as a variational lower bound of a predictive information bottleneck objective. This generalized functional encompasses most modern inference procedures and suggests novel ones.

inference, information, representation, (10 more...)

1910.10831

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)

Thakur, Sanjay, Van Hoof, Herke, Gupta, Gunshi, Meger, David

Unifying Variational Inference and PAC-Bayes for Supervised Learning that Scales

Neural Network based controllers hold enormous potential to learn complex, high-dimensional functions. However, they are prone to overfitting and unwarranted extrapolations. PAC Bayes is a generalized framework which is more resistant to overfitting and that yields performance bounds that hold with arbitrarily high probability even on the unjustified extrapolations. However, optimizing to learn such a function and a bound is intractable for complex tasks. In this work, we propose a method to simultaneously learn such a function and estimate performance bounds that scale organically to high-dimensions, non-linear environments without making any explicit assumptions about the environment. We build our approach on a parallel that we draw between the formulations called ELBO and PAC Bayes when the risk metric is negative log likelihood. Through our experiments on multiple high dimensional MuJoCo locomotion tasks, we validate the correctness of our theory, show its ability to generalize better, and investigate the factors that are important for its learning. The code for all the experiments is available at https://bit.ly/2qv0JjA.

experiment, formulation, neural network, (13 more...)

1910.10367

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > Canada > Quebec (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Setlur, Amrith, Póczós, Barnabás

Better Approximate Inference for Partial Likelihood Models with a Latent Structure

arXiv.org Machine LearningOct-22-2019

Temporal Point Processes (TPP) with partial likelihoods involving a latent structure often entail an intractable marginalization, thus making inference hard. We propose a novel approach to Maximum Likelihood Estimation (MLE) involving approximate inference over the latent variables by minimizing a tight upper bound on the approximation gap. Given a discrete latent variable $Z$, the proposed approximation reduces inference complexity from $O(|Z|^c)$ to $O(|Z|)$. We use convex conjugates to determine this upper bound in a closed form and show that its addition to the optimization objective results in improved results for models assuming proportional hazards as in Survival Analysis.

inequality, likelihood, objective, (11 more...)

1910.10211

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Romania > Nord-Vest Development Region > Bihor County > Oradea (0.04)
Europe > Netherlands > South Holland > Rotterdam (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.55)

Moreno-Muñoz, Pablo, Ramírez, David, Artés-Rodríguez, Antonio

Continual Learning for Infinite Hierarchical Change-Point Detection

arXiv.org Machine LearningOct-22-2019

Change-point detection (CPD) aims to locate abrupt transitions in the generative model of a sequence of observations. When Bayesian methods are considered, the standard practice is to infer the posterior distribution of the change-point locations. However, for complex models (high-dimensional or heterogeneous), it is not possible to perform reliable detection. To circumvent this problem, we propose to use a hierarchical model, which yields observations that belong to a lower-dimensional manifold. Concretely, we consider a latent-class model with an unbounded number of categories, which is based on the chinese-restaurant process (CRP). For this model we derive a continual learning mechanism that is based on the sequential construction of the CRP and the expectation-maximization (EM) algorithm with a stochastic maximization step. Our results show that the proposed method is able to recursively infer the number of underlying latent classes and perform CPD in a reliable manner.

change point, detection, map estimate, (15 more...)

1910.10087

Country:

Europe > Spain > Galicia > Madrid (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Zhu, Jie, Gallego, Blanca

Targeted Estimation of Heterogeneous Treatment Effect in Observational Survival Analysis

arXiv.org Machine LearningOct-22-2019

The aim of clinical effectiveness research using repositories of electronic health records is to identify what health interventions 'work best' in real-world settings. Since there are several reasons why the net benefit of intervention may differ across patients, current comparative effectiveness literature focuses on investigating heterogeneous treatment effect and predicting whether an individual might benefit from an intervention. The majority of this literature has concentrated on the estimation of the effect of treatment on binary outcomes. However, many medical interventions are evaluated in terms of their effect on future events, which are subject to loss to follow-up. In this study, we describe a framework for the estimation of heterogeneous treatment effect in terms of differences in time-to-event (survival) probabilities. We divide the problem into three phases: (1) estimation of treatment effect conditioned on unique sets of the covariate vector; (2) identification of features important for heterogeneity using an ensemble of non-parametric variable importance methods; and (3) estimation of treatment effect on the reference classes defined by the previously selected features, using one-step Targeted Maximum Likelihood Estimation. We conducted a series of simulation studies and found that this method performs well when either sample size or event rate is high enough and the number of covariates contributing to the effect heterogeneity is moderate. An application of this method to a clinical case study was conducted by estimating the effect of oral anticoagulants on newly diagnosed non-valvular atrial fibrillation patients using data from the UK Clinical Practice Research Datalink.

estimation, survival probability, treatment effect, (14 more...)

1910.08877

Country:

Europe > United Kingdom (0.14)
Oceania > Australia > New South Wales (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength Medium (0.94)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.88)
Health & Medicine > Health Care Technology > Medical Record (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)