AITopics | Le, Tuan Anh

Plotting

Le, Tuan Anh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Tighter Variational Bounds are Not Necessarily Better

Rainforth, Tom, Kosiorek, Adam R., Le, Tuan Anh, Maddison, Chris J., Igl, Maximilian, Wood, Frank, Teh, Yee Whye

arXiv.org Machine LearningJun-25-2018

We provide theoretical and empirical evidence that using tighter evidence lower bounds (ELBOs) can be detrimental to the process of learning an inference network by reducing the signal-to-noise ratio of the gradient estimator. Our results call into question common implicit assumptions that tighter ELBOs are better variational objectives for simultaneous model learning and inference amortization schemes. Based on our insights, we introduce three new algorithms: the partially importance weighted auto-encoder (PIWAE), the multiply importance weighted auto-encoder (MIWAE), and the combination importance weighted auto-encoder (CIWAE), each of which includes the standard importance weighted auto-encoder (IWAE) as a special case. We show that each can deliver improvements over IWAE, even when performance is measured by the IWAE target itself. Furthermore, our results suggest that PIWAE may be able to deliver simultaneous improvements in the training of both the inference and generative networks.

artificial intelligence, convergence, neural network, (19 more...)

arXiv.org Machine Learning

1802.04537

Country: Europe > Sweden (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Deep Variational Reinforcement Learning for POMDPs

Igl, Maximilian, Zintgraf, Luisa, Le, Tuan Anh, Wood, Frank, Whiteson, Shimon

arXiv.org Machine LearningJun-6-2018

Many real-world sequential decision making problems are partially observable by nature, and the environment model is typically unknown. Consequently, there is great need for reinforcement learning methods that can tackle such problems given only a stream of incomplete and noisy observations. In this paper, we propose deep variational reinforcement learning (DVRL), which introduces an inductive bias that allows an agent to learn a generative model of the environment and perform inference in that model to effectively aggregate the available information. We develop an n-step approximation to the evidence lower bound (ELBO), allowing the model to be trained jointly with the policy. This ensures that the latent state representation is suitable for the control task. In experiments on Mountain Hike and flickering Atari we show that our method outperforms previous approaches relying on recurrent neural networks to encode the past.

artificial intelligence, deep learning, neural network, (17 more...)

arXiv.org Machine Learning

1806.02426

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre:

Instructional Material (0.68)
Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)

Add feedback

Revisiting Reweighted Wake-Sleep

Le, Tuan Anh, Kosiorek, Adam R., Siddharth, N., Teh, Yee Whye, Wood, Frank

arXiv.org Machine LearningMay-26-2018

Discrete latent-variable models, while applicable in a variety of settings, can often be difficult to learn. Sampling discrete latent variables can result in high-variance gradient estimators for two primary reasons: 1. branching on the samples within the model, and 2. the lack of a pathwise derivative for the samples. While current state-of-the-art methods employ control-variate schemes for the former and continuous-relaxation methods for the latter, their utility is limited by the complexities of implementing and training effective control-variate schemes and the necessity of evaluating (potentially exponentially) many branch paths in the model. Here, we revisit the reweighted wake-sleep (RWS) (Bornschein and Bengio, 2015) algorithm, and through extensive evaluations, show that it circumvents both these issues, outperforming current state-of-the-art methods in learning discrete latent-variable models. Moreover, we observe that, unlike the importance weighted autoencoder, RWS learns better models and inference networks with increasing numbers of particles, and that its benefits extend to continuous latent-variable models as well. Our results suggest that RWS is a competitive, often preferable, alternative for learning deep generative models.

deep learning, inference network, neural network, (19 more...)

arXiv.org Machine Learning

1805.10469

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Bayesian Optimization for Probabilistic Programs

Rainforth, Tom, Le, Tuan Anh, van de Meent, Jan-Willem, Osborne, Michael A., Wood, Frank

arXiv.org Machine LearningJul-13-2017

We present the first general purpose framework for marginal maximum a posteriori estimation of probabilistic program variables. By using a series of code transformations, the evidence of any probabilistic program, and therefore of any graphical model, can be optimized with respect to an arbitrary subset of its sampled variables. To carry out this optimization, we develop the first Bayesian optimization package to directly exploit the source code of its target, leading to innovations in problem-independent hyperpriors, unbounded optimization, and implicit constraint satisfaction; delivering significant performance improvements over prominent existing packages. We present applications of our method to a number of tasks including engineering design and parameter optimization.

artificial intelligence, optimization, optimization problem, (17 more...)

arXiv.org Machine Learning

1707.04314

Genre: Research Report (0.64)

Industry: Energy (0.46)

Add feedback

Auto-Encoding Sequential Monte Carlo

Le, Tuan Anh, Igl, Maximilian, Jin, Tom, Rainforth, Tom, Wood, Frank

arXiv.org Machine LearningMay-29-2017

We introduce AESMC: a method for using deep neural networks for simultaneous model learning and inference amortization in a broad family of structured probabilistic models. Starting with an unlabeled dataset and a partially specified underlying generative model, AESMC refines the generative model and learns efficient proposal distributions for SMC for performing inference in this model. Our approach relies on 1) efficiency of SMC in performing inference in structured probabilistic models and 2) flexibility of deep neural networks to model complex conditional probability distributions. We demonstrate that our approach provides a fast, accurate, easy-to-implement, and scalable means for carrying out parameter estimation in high-dimensional statistical models as well as simultaneous model learning and proposal amortization in neural network based models.

deep learning, neural network, smc, (21 more...)

arXiv.org Machine Learning

1705.10306

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Add feedback

Using Synthetic Data to Train Neural Networks is Model-Based Reasoning

Le, Tuan Anh, Baydin, Atilim Gunes, Zinkov, Robert, Wood, Frank

arXiv.org Machine LearningMar-2-2017

We draw a formal connection between using synthetic training data to optimize neural network parameters and approximate, Bayesian, model-based reasoning. In particular, training a neural network using synthetic data can be viewed as learning a proposal distribution generator for approximate inference in the synthetic-data generative model. We demonstrate this connection in a recognition task where we develop a novel Captcha-breaking architecture and train it using synthetic data, demonstrating both state-of-the-art performance and a way of computing task-specific posterior uncertainty. Using a neural network trained this way, we also demonstrate successful breaking of real-world Captchas currently used by Facebook and Wikipedia. Reasoning from these empirical results and drawing connections with Bayesian modeling, we discuss the robustness of synthetic data results and suggest important considerations for ensuring good neural network generalization when training with synthetic data.

deep learning, neural network, synthetic data, (15 more...)

arXiv.org Machine Learning

1703.00868

Country:

Europe (0.68)
North America > United States > Indiana (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Add feedback

Inference Compilation and Universal Probabilistic Programming

Le, Tuan Anh, Baydin, Atilim Gunes, Wood, Frank

arXiv.org Artificial IntelligenceMar-2-2017

We introduce a method for using deep neural networks to amortize the cost of inference in models from the family induced by universal probabilistic programming languages, establishing a framework that combines the strengths of probabilistic programming and deep learning methods. We call what we do "compilation of inference" because our method transforms a denotational specification of an inference problem in the form of a probabilistic program written in a universal programming language into a trained neural network denoted in a neural network specification language. When at test time this neural network is fed observational data and executed, it performs approximate inference in the original model specified by the probabilistic program. Our training objective and learning procedure are designed to allow the trained neural network to be used as a proposal distribution in a sequential importance sampling inference engine. We illustrate our method on mixture models and Captcha solving and show significant speedups in the efficiency of inference.

deep learning, inference, neural network, (16 more...)

arXiv.org Artificial Intelligence

1610.099

Country:

North America > United States (0.46)
Europe (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Bayesian Optimization for Probabilistic Programs

Rainforth, Tom, Le, Tuan Anh, Meent, Jan-Willem van de, Osborne, Michael A., Wood, Frank

Neural Information Processing SystemsDec-31-2016

We present the first general purpose framework for marginal maximum a posteriori estimationof probabilistic program variables. By using a series of code transformations, the evidence of any probabilistic program, and therefore of any graphical model, can be optimized with respect to an arbitrary subset of its sampled variables. To carry out this optimization, we develop the first Bayesian optimization package to directly exploit the source code of its target, leading to innovations in problem-independent hyperpriors, unbounded optimization, and implicit constraint satisfaction; delivering significant performance improvements over prominent existing packages.We present applications of our method to a number of tasks including engineering design and parameter optimization.

artificial intelligence, optimization, optimization problem, (18 more...)

Neural Information Processing Systems

Country: Europe > Spain (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)

Add feedback

Data-driven Sequential Monte Carlo in Probabilistic Programming

Perov, Yura N, Le, Tuan Anh, Wood, Frank

arXiv.org Artificial IntelligenceMay-16-2016

Most of Markov Chain Monte Carlo (MCMC) and sequential Monte Carlo (SMC) algorithms in existing probabilistic programming systems suboptimally use only model priors as proposal distributions. In this work, we describe an approach for training a discriminative model, namely a neural network, in order to approximate the optimal proposal by using posterior estimates from previous runs of inference. We show an example that incorporates a data-driven proposal for use in a non-parametric model in the Anglican probabilistic programming system. Our results show that data-driven proposals can significantly improve inference performance so that considerably fewer particles are necessary to perform a good posterior estimation.

artificial intelligence, neural network, proposal, (15 more...)

arXiv.org Artificial Intelligence

1512.04387

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback