Collaborating Authors

 doshi-velez




Towards Model-Agnostic Posterior Approximation for Fast and Accurate Variational Autoencoders

arXiv.org Machine Learning

Inference for Variational Autoencoders (VAEs) consists of learning two models: (1) a generative model, which transforms a simple distribution over a latent space into the distribution over observed data, and (2) an inference model, which approximates the posterior of the latent codes given data. The two components are learned jointly via a lower bound on the generative model's log marginal likelihood. In early phases of joint training, the inference model poorly approximates the latent code posteriors. Recent work showed that this leads optimization to get stuck in local optima, negatively impacting the learned generative model. As such, recent work suggests ensuring a high-quality inference model via iterative training: maximizing the objective function relative to the inference model before every update to the generative model. Unfortunately, iterative training is inefficient, requiring heuristic criteria for reverting from iterative to joint training for speed. Here, we suggest an inference method that trains the generative and inference models independently. It approximates the posterior of the true model a priori; fixing this posterior approximation, we then maximize the lower bound relative to only the generative model. By conventional wisdom, this approach should rely on the true prior and likelihood of the true model (which are unknown) to approximate its posterior. However, we show that we can compute a deterministic, model-agnostic posterior approximation (MAPA) of the true model's posterior. We then use MAPA to develop a proof-of-concept inference method. We present preliminary results on low-dimensional synthetic data showing that (1) MAPA captures the trend of the true posterior, and (2) our MAPA-based inference performs better density estimation with less computation than baselines. Lastly, we present a roadmap for scaling the MAPA-based inference method to high-dimensional data.
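
The lower bound mentioned in this abstract can be illustrated with a small sketch. The one-dimensional linear-Gaussian model below (z ~ N(0, 1), x | z ~ N(w z, s2)) is an assumption chosen for illustration because its marginal likelihood and posterior are available in closed form; it is not the model or the MAPA method from the paper. The function names are hypothetical. The sketch shows that the bound is tight when the inference distribution equals the exact posterior and loose when it does not.

```python
import math

def log_marginal(x, w, s2):
    """Exact log p(x) for the toy model z ~ N(0, 1), x | z ~ N(w * z, s2)."""
    var_x = w * w + s2  # marginally, x ~ N(0, w^2 + s2)
    return -0.5 * (math.log(2 * math.pi * var_x) + x * x / var_x)

def exact_posterior(x, w, s2):
    """Mean and variance of the exact Gaussian posterior p(z | x)."""
    var = 1.0 / (1.0 + w * w / s2)
    mean = var * w * x / s2
    return mean, var

def elbo(x, w, s2, q_mean, q_var):
    """Evidence lower bound for a Gaussian q(z) = N(q_mean, q_var).

    ELBO = E_q[log p(x | z)] - KL(q(z) || p(z)), both in closed form here.
    """
    expected_loglik = (
        -0.5 * math.log(2 * math.pi * s2)
        - ((x - w * q_mean) ** 2 + w * w * q_var) / (2 * s2)
    )
    kl = 0.5 * (q_var + q_mean ** 2 - 1.0 - math.log(q_var))
    return expected_loglik - kl

# Toy values (assumptions, chosen arbitrarily for the illustration).
x, w, s2 = 1.5, 2.0, 0.5
post_mean, post_var = exact_posterior(x, w, s2)

tight = elbo(x, w, s2, post_mean, post_var)  # q equals the exact posterior
loose = elbo(x, w, s2, 0.0, 1.0)             # q is a poor approximation (the prior)
gap = log_marginal(x, w, s2) - loose         # equals KL(q || exact posterior)
```

Here `tight` matches `log_marginal(x, w, s2)` up to floating-point error, while `loose` falls below it; the size of `gap` is what a poorly fitted inference model costs the bound in this toy setting.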


Doshi-Velez

AAAI Conferences

The objective of my doctoral research is to bring together two fields, partially-observable reinforcement learning (PORL) and non-parametric Bayesian statistics (NPB), to address issues of statistical modeling and decision-making in complex, real-world domains.


Explaining Reward Functions to Humans for Better Human-Robot Collaboration

arXiv.org Artificial Intelligence

Explainable AI techniques that describe agent reward functions can enhance human-robot collaboration in a variety of settings. One context where human understanding of agent reward functions is particularly beneficial is in the value alignment setting. In the value alignment context, an agent aims to infer a human's reward function through interaction so that it can assist the human with their tasks. If the human can understand where gaps exist in the agent's reward understanding, they will be able to teach more efficiently and effectively, leading to quicker human-agent team performance improvements. In order to support human collaborators in the value alignment setting and similar contexts, it is first important to understand the effectiveness of different reward explanation techniques in a variety of domains. In this paper, we introduce a categorization of information modalities for reward explanation techniques, suggest a suite of assessment techniques for human reward understanding, and introduce four axes of domain complexity. We then propose an experiment to study the relative efficacy of a broad set of reward explanation techniques covering multiple modalities of information in a set of domains of varying complexity.


Getting AI to work in a fleshy, messy world is harder than you think

#artificialintelligence

At the warehouses of British online grocery company Ocado Technology, robots, guided by AI, whizz around on rails at speeds of up to four metres per second, picking a 50-item order in minutes. The journeys then taken by Ocado's delivery trucks are optimised by a neural network that makes more than 14 million last-mile routing calculations per second, and adjusts delivery routes each time a customer places a new order or adds extra items to their shopping lists. But Ocado's most ambitious automation efforts involve packing robots. At the time of writing the company has five robotic picking arms powered by computer vision, and other machine-learning systems that can identify the products that need to be packed and use suction power to grab them. Further advances, undertaken in conjunction with two European academic-led projects, are in the pipeline. "From a human's perspective, it is a fairly simple task to pick and pack, and it doesn't require an awful lot of training," says Alex Harvey, chief of advanced technology at Ocado Technology. "For a computer and for a robot, the dexterous manipulation involved is far beyond the state of the art today to be able to pick and pack the full range of items that we do."


POPCORN: Partially Observed Prediction COnstrained ReiNforcement Learning

arXiv.org Machine Learning

Many medical decision-making settings can be framed as partially observed Markov decision processes (POMDPs). However, popular two-stage approaches that first learn a POMDP model and then solve it often fail because the model that best fits the data may not be the best model for planning. We introduce a new optimization objective that (a) produces both high-performing policies and high-quality generative models, even when some observations are irrelevant for planning, and (b) does so in the kinds of batch, off-policy settings common in medicine. We demonstrate our approach on synthetic examples and a real-world hypotension management task.


AI, Explain Yourself

Communications of the ACM

Artificial Intelligence (AI) systems are taking over a vast array of tasks that previously depended on human expertise and judgment. Often, however, the "reasoning" behind their actions is unclear, and can produce surprising errors or reinforce biased processes. One way to address this issue is to make AI "explainable" to humans, for example so that designers can improve it or users can better judge when to trust it. Although the best styles of explanation for different purposes are still being studied, they will profoundly shape how future AI is used. Some explainable AI, or XAI, has long been familiar, as part of online recommender systems: book purchasers or movie viewers see suggestions for additional selections described as having certain similar attributes, or being chosen by similar users.


The State of AI - Trajectory Magazine

#artificialintelligence

"Humans tend to overestimate technology in the short term but underestimate it in the long term," said Tom Foster, editor at large for Inc. magazine, during a panel he moderated on innovations in machine learning at South by Southwest (SXSW) in March. Artificial intelligence (AI) was a recurring theme across panels at SXSW 2018's Interactive Conference held in Austin, Texas. The topic was particularly popular in tracks titled "Intelligent Future" and "Startup & Tech Sectors." Many AI experts marveled at recent advances in the technology while pondering its future. "I've been working in AI for now more than 30 years and in the past eight years there are things that have occurred that I never thought would happen in my lifetime," said Adam Cheyer, co-founder of Viv Labs, during a discussion on innovations in AI.


Structured Variational Learning of Bayesian Neural Networks with Horseshoe Priors

arXiv.org Machine Learning

Bayesian Neural Networks (BNNs) have recently received increasing attention for their ability to provide well-calibrated posterior uncertainties. However, model selection (even choosing the number of nodes) remains an open question. Recent work has proposed the use of a horseshoe prior over node pre-activations of a Bayesian neural network, which effectively turns off nodes that do not help explain the data. In this work, we propose several modeling and inference advances that consistently improve the compactness of the model learned while maintaining predictive performance, especially in smaller-sample settings including reinforcement learning.
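
To make the "turns off nodes" behaviour of a horseshoe prior more concrete, here is a minimal sketch of sampling from the standard horseshoe construction: each node gets a half-Cauchy local scale, and a weight drawn as a Gaussian with that scale. This is only an illustration of the prior itself, under the assumption of one scale per node and a fixed global scale; it is not the structured variational inference scheme proposed in the paper, and the function names are hypothetical.

```python
import math
import random

def sample_half_cauchy(rng, scale=1.0):
    """Draw from a half-Cauchy(0, scale) via the inverse CDF of a Cauchy."""
    u = rng.random()
    return abs(scale * math.tan(math.pi * (u - 0.5)))

def sample_horseshoe(num_nodes, global_scale=1.0, seed=0):
    """Sample one weight per node under a horseshoe prior.

    Returns the sampled weights and the per-node shrinkage coefficients
    kappa = 1 / (1 + (global_scale * local_scale)^2). A kappa near 1 means
    the node is shrunk towards zero; a kappa near 0 means it is left
    almost unshrunk.
    """
    rng = random.Random(seed)
    local_scales = [sample_half_cauchy(rng) for _ in range(num_nodes)]
    weights = [rng.gauss(0.0, global_scale * lam) for lam in local_scales]
    shrinkage = [1.0 / (1.0 + (global_scale * lam) ** 2) for lam in local_scales]
    return weights, shrinkage

weights, shrinkage = sample_horseshoe(num_nodes=10000)

# Fraction of nodes that are either almost fully shrunk or almost untouched.
extreme_fraction = sum(1 for k in shrinkage if k < 0.1 or k > 0.9) / len(shrinkage)
```

With a global scale of 1, the shrinkage coefficients follow a Beta(1/2, 1/2) distribution, which piles mass near 0 and 1. That U shape is why the prior tends to either keep a node or switch it off rather than shrink all nodes moderately.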