AITopics

1806.05355

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Texas > Dallas County > Richardson (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceJun-13-2018

Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam

Khan, Mohammad Emtiyaz, Nielsen, Didrik, Tangkaratt, Voot, Lin, Wu, Gal, Yarin, Srivastava, Akash

Uncertainty computation in deep learning is essential to design robust and reliable systems. Variational inference (VI) is a promising approach for such computation, but requires more effort to implement and execute compared to maximum-likelihood methods. In this paper, we propose new natural-gradient algorithms to reduce such efforts for Gaussian mean-field VI. Our algorithms can be implemented within the Adam optimizer by perturbing the network weights during gradient evaluations, and uncertainty estimates can be cheaply obtained by using the vector that adapts the learning rate. This requires lower memory, computation, and implementation effort than existing VI methods, while obtaining uncertainty estimates of comparable quality. Our empirical results confirm this and further suggest that the weight-perturbation in our algorithm could be useful for exploration in reinforcement learning and stochastic optimization.

approximation, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1806.04854

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Sankaranarayanan, Sreecharan, Annasamy, Raghuram Mandyam, Sycara, Katia, Rosé, Carolyn Penstein

Combining Model-Free Q-Ensembles and Model-Based Approaches for Informed Exploration

arXiv.org Artificial IntelligenceJun-12-2018

Q-Ensembles are a model-free approach where input images are fed into different Q-networks and exploration is driven by the assumption that uncertainty is proportional to the variance of the output Q-values obtained. They have been shown to perform relatively well compared to other exploration strategies. Further, model-based approaches, such as encoder-decoder models have been used successfully for next frame prediction given previous frames. This paper proposes to integrate the model-free Q-ensembles and model-based approaches with the hope of compounding the benefits of both and achieving superior exploration as a result. Results show that a model-based trajectory memory approach when combined with Q-ensembles produces superior performance when compared to only using Q-ensembles.

deep learning, exploration, upstream oil & gas, (19 more...)

arXiv.org Artificial Intelligence

1806.04552

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Frogner, Charlie, Poggio, Tomaso

Approximate inference with Wasserstein gradient flows

We present a novel approximate inference method for diffusion processes, based on the Wasserstein gradient flow formulation of the diffusion. In this formulation, the time-dependent density of the diffusion is derived as the limit of implicit Euler steps that follow the gradients of a particular free energy functional. Existing methods for computing Wasserstein gradient flows rely on discretization of the domain of the diffusion, prohibiting their application to domains in more than several dimensions. We propose instead a discretization-free inference method that computes the Wasserstein gradient flow directly in a space of continuous functions. We characterize approximation properties of the proposed method and evaluate it on a nonlinear filtering task, finding performance comparable to the state-of-the-art for filtering diffusions.

artificial intelligence, diffusion, upstream oil & gas, (15 more...)

1806.04542

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

de Castro, Pablo, Dorigo, Tommaso

INFERNO: Inference-Aware Neural Optimisation

Complex computer simulations are commonly required for accurate data modelling in many scientific disciplines, making statistical inference challenging due to the intractability of the likelihood evaluation for the observed data. Furthermore, sometimes one is interested on inference drawn over a subset of the generative model parameters while taking into account model uncertainty or misspecification on the remaining nuisance parameters. In this work, we show how non-linear summary statistics can be constructed by minimising inference-motivated losses via stochastic gradient descent.

artificial intelligence, machine learning, nuisance parameter, (17 more...)

1806.04743

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

A Novel Bayesian Approach for Latent Variable Modeling from Mixed Data with Missing Values

Cui, Ruifei, Bucur, Ioan Gabriel, Groot, Perry, Heskes, Tom

We consider the problem of learning parameters of latent variable models from mixed (continuous and ordinal) data with missing values. We propose a novel Bayesian Gaussian copula factor (BGCF) approach that is consistent under certain conditions and that is quite robust to the violations of these conditions. In simulations, BGCF substantially outperforms two state-of-the-art alternative approaches. An illustration on the `Holzinger & Swineford 1939' dataset indicates that BGCF is favorable over the so-called robust maximum likelihood (MLR) even if the data match the assumptions of MLR.

artificial intelligence, bayesian inference, machine learning, (15 more...)

1806.0461

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Netherlands > Gelderland > Nijmegen (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Gong, Wenbo, Li, Yingzhen, Hernández-Lobato, José Miguel

Meta-Learning for Stochastic Gradient MCMC

Stochastic gradient Markov chain Monte Carlo (SG-MCMC) has become increasingly popular for simulating posterior samples in large-scale Bayesian modeling. However, existing SG-MCMC schemes are not tailored to any specific probabilistic model, even a simple modification of the underlying dynamical system requires significant physical intuition. This paper presents the first meta-learning algorithm that allows automated design for the underlying continuous dynamics of an SG-MCMC sampler. The learned sampler generalizes Hamiltonian dynamics with state-dependent drift and diffusion, enabling fast traversal and efficient exploration of neural network energy landscapes. Experiments validate the proposed approach on both Bayesian fully connected neural network and Bayesian recurrent neural network tasks, showing that the learned sampler out-performs generic, hand-designed SG-MCMC algorithms, and generalizes to different datasets and larger architectures.

artificial intelligence, machine learning, sampler, (15 more...)

1806.04522

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(2 more...)

Casgrain, Philippe, Jaimungal, Sebastian

Trading algorithms with learning in latent alpha models

Alpha signals for statistical arbitrage strategies are often driven by latent factors. This paper analyses how to optimally trade with latent factors that cause prices to jump and diffuse. Moreover, we account for the effect of the trader's actions on quoted prices and the prices they receive from trading. Under fairly general assumptions, we demonstrate how the trader can learn the posterior distribution over the latent states, and explicitly solve the latent optimal trading problem. We provide a verification theorem, and a methodology for calibrating the model by deriving a variation of the expectation-maximization algorithm. To illustrate the efficacy of the optimal strategy, we demonstrate its performance through simulations and compare it to strategies which ignore learning in the latent factors. We also provide calibration results for a particular model using Intel Corporation stock as an example.

artificial intelligence, bayesian inference, machine learning, (19 more...)

1806.04472

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

arXiv.org Artificial IntelligenceJun-12-2018

Disintegration and Bayesian Inversion via String Diagrams

Cho, Kenta, Jacobs, Bart

The notions of disintegration and Bayesian inversion are fundamental in conditional probability theory. They produce channels, as conditional probabilities, from a joint state, or from an already given channel (in opposite direction). These notions exist in the literature, in concrete situations, but are presented here in abstract graphical formulations. The resulting abstract descriptions are used for proving basic results in conditional probability theory. The existence of disintegration and Bayesian inversion is discussed for discrete probability, and also for measure-theoretic probability --- via standard Borel spaces and via likelihoods. Finally, the usefulness of disintegration and Bayesian inversion is illustrated in several examples.

artificial intelligence, disintegration, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1709.00322

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Netherlands > Gelderland > Nijmegen (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Mosser, Lukas, Dubrule, Olivier, Blunt, Martin J.

Stochastic seismic waveform inversion using generative adversarial networks as a geological prior

arXiv.org Machine LearningJun-10-2018

We present an application of deep generative models in the context of partial-differential equation (PDE) constrained inverse problems. We combine a generative adversarial network (GAN) representing an a priori model that creates subsurface geological structures and their petrophysical properties, with the numerical solution of the PDE governing the propagation of acoustic waves within the earth's interior. We perform Bayesian inversion using an approximate Metropolis-adjusted Langevin algorithm (MALA) to sample from the posterior given seismic observations. Gradients with respect to the model parameters governing the forward problem are obtained by solving the adjoint of the acoustic wave equation. Gradients of the mismatch with respect to the latent variables are obtained by leveraging the differentiable nature of the deep neural network used to represent the generative model. We show that approximate MALA sampling allows efficient Bayesian inversion of model parameters obtained from a prior represented by a deep generative model, obtaining a diverse set of realizations that reflect the observed seismic response.

deep learning, inversion, upstream oil & gas, (22 more...)

1806.0372

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)