AITopics | Undirected Networks

Undirected Networks

Blind Spot Detection for Safe Sim-to-Real Transfer

Ramakrishnan, Ramya (Massachusetts Institute of Technology) | Kamar, Ece | Dey, Debadeepta | Horvitz, Eric | Shah, Julie

Journal of Artificial Intelligence Research, Feb-4-2020

Agents trained in simulation may make errors when performing actions in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult for the agent to discover because the agent is unable to predict them a priori. In this work, we propose the use of oracle feedback to learn a predictive model of these blind spots in order to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: when the agent lacks necessary features to represent the true state of the world, and thus cannot distinguish between numerous states. We formalize the problem of discovering blind spots in RL as a noisy supervised learning problem with class imbalance. Our system learns models for predicting blind spots within unseen regions of the state space by combining techniques for label aggregation, calibration, and supervised learning. These models take into consideration noise emerging from different forms of oracle feedback, including demonstrations and corrections. We evaluate our approach across two domains and demonstrate that it achieves higher predictive performance than baseline methods, and also that the learned model can be used to selectively query an oracle at execution time to prevent errors. We also empirically analyze the biases of various feedback types and how these biases influence the discovery of blind spots. Further, we include analyses of our approach that incorporate relaxed initial optimality assumptions. (Interestingly, relaxing the assumptions of an optimal oracle and an optimal simulator policy helped our models to perform better.) We also propose extensions to our method that are intended to improve performance when using corrections and demonstrations data.
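The idea in the abstract above (aggregate noisy oracle labels per state, then query the oracle only where a blind spot looks likely) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the plain averaging used as a stand-in for the paper's label-aggregation and calibration steps, and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def aggregate_labels(feedback):
    """Average noisy 0/1 blind-spot labels per observed state.

    `feedback` is a list of (state, label) pairs, where label 1 means the
    oracle's feedback suggested the agent's action was unacceptable.
    Plain averaging is a simplified stand-in for a real aggregation model.
    """
    votes = defaultdict(list)
    for state, label in feedback:
        votes[state].append(label)
    return {state: sum(v) / len(v) for state, v in votes.items()}


def should_query_oracle(p_blind_spot, threshold=0.5):
    """Ask the oracle for help only when a blind spot is likely enough."""
    return p_blind_spot >= threshold


# Hypothetical feedback: state "s1" is mostly fine, "s2" is always flagged.
feedback = [("s1", 0), ("s1", 0), ("s1", 1), ("s2", 1), ("s2", 1)]
estimates = aggregate_labels(feedback)
queries = {s: should_query_oracle(p) for s, p in estimates.items()}
```

In this toy setup the threshold trades off oracle queries against the risk of acting inside a blind spot; a learned, calibrated classifier would replace the per-state average when generalizing to unseen states.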

agent, blind spot, oracle, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.11436

AI Access Foundation

11436

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Washington > King County > Redmond (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Education (1.00)
Transportation (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
(3 more...)

Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise

Kaledin, Maxim, Moulines, Eric, Naumov, Alexey, Tadic, Vladislav, Wai, Hoi-To

arXiv.org Machine Learning, Feb-4-2020

The linear two-timescale stochastic approximation (SA) scheme is an important class of algorithms that has become popular in reinforcement learning (RL), particularly for the policy evaluation problem. Recently, a number of works have been devoted to establishing a finite-time analysis of the scheme, especially under the Markovian (non-i.i.d.) noise settings that are ubiquitous in practice. In this paper, we provide a finite-time analysis for linear two-timescale SA. Our bounds show that there is no discrepancy in the convergence rate between Markovian and martingale noise; only the constants are affected by the mixing time of the Markov chain. With an appropriate step size schedule, the transient term in the expected error bound is $o(1/k^c)$ and the steady-state term is $O(1/k)$, where $c > 1$ and $k$ is the iteration number. Furthermore, we present an asymptotic expansion of the expected error with a matching lower bound of $\Omega(1/k)$. A simple numerical experiment is presented to support our theory.

Keywords: stochastic approximation, reinforcement learning, GTD learning, Markovian noise

1. Introduction

Since its introduction close to 70 years ago, the stochastic approximation (SA) scheme (Robbins and Monro, 1951) has been a powerful tool for root finding when only noisy samples are available. During the past two decades, considerable progress has been made in the practical and theoretical research of SA; see (Benaïm, 1999; Kushner and Yin, 2003; Borkar, 2008) for an overview. Among others, linear SA schemes are popular in reinforcement learning (RL) as they lead to policy evaluation methods with linear function approximation. Of particular importance is temporal difference (TD) learning (Sutton, 1988), for which finite-time analysis has been reported in (Srikant and Ying, 2019; Lakshminarayanan and Szepesvari, 2018; Bhandari et al., 2018; Dalal et al., 2018a).

The TD learning scheme based on classical (linear) SA is known to be inadequate for the off-policy learning paradigms in RL, where data samples are drawn from a behavior policy different from the policy being evaluated (Baird, 1995; Tsitsiklis and Van Roy, 1997). To circumvent this [...] (Authors listed in alphabetical order.) These methods fall within the scope of the linear two-timescale SA scheme introduced by Borkar (1997):

$$\theta_{k+1} = \theta_k + \beta_k \{ \tilde{b}_1(X_{k+1}) - \tilde{A}_{11}(X_{k+1}) \theta_k - \tilde{A}_{12}(X_{k+1}) w_k \}, \tag{1}$$

$$w_{k+1} = w_k + \gamma_k \{ \tilde{b}_2(X_{k+1}) - \tilde{A}_{21}(X_{k+1}) \theta_k - \tilde{A}_{22}(X_{k+1}) w_k \}.$$
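To make the two coupled updates of the linear two-timescale SA scheme concrete, the following is a minimal Python sketch on a scalar toy problem. It is an illustration under stated assumptions, not the paper's experiment: the coefficient values, the two-state Markov chain that drives the noise, the step sizes 1/(k+1) and 1/(k+1)^(2/3), and the minus signs in the update (the standard form for this scheme, since the excerpt above lost its operators) are all choices made here.

```python
import random


def two_timescale_sa(num_iters=50_000, seed=0):
    """Run a scalar linear two-timescale SA recursion with Markovian noise."""
    rng = random.Random(seed)
    # Hypothetical scalar problem: the mean fields are
    #   b1 - A11*theta - A12*w   and   b2 - A21*theta - A22*w.
    A11, A12, A21, A22 = 1.0, 0.5, 0.5, 1.0
    b1, b2 = 1.0, 1.0
    theta, w = 0.0, 0.0
    x = 0  # state of a two-state Markov chain driving the noise
    for k in range(num_iters):
        # Markovian noise: stay in the current state with probability 0.9.
        if rng.random() > 0.9:
            x = 1 - x
        noise = 0.2 if x == 1 else -0.2  # zero mean under the stationary law
        beta = 1.0 / (k + 1)               # slow step size, used for theta
        gamma = 1.0 / (k + 1) ** (2 / 3)   # fast step size, used for w
        new_theta = theta + beta * ((b1 + noise) - A11 * theta - A12 * w)
        new_w = w + gamma * ((b2 + noise) - A21 * theta - A22 * w)
        theta, w = new_theta, new_w
    return theta, w


theta, w = two_timescale_sa()
# The noiseless fixed point solves A11*theta + A12*w = b1 and
# A21*theta + A22*w = b2, which gives theta = w = 2/3 for these values.
```

The slow iterate uses the faster-decaying step size, so the fast iterate w effectively tracks its equilibrium for the current theta; both iterates should settle near 2/3 in this toy example.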

inequality, timescale sa, stochastic, (14 more...)

arXiv.org Machine Learning

2002.01268

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)
Asia > China > Hong Kong (0.04)
(3 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

tfp.mcmc: Modern Markov Chain Monte Carlo Tools Built for Modern Hardware

Lao, Junpeng, Suter, Christopher, Langmore, Ian, Chimisov, Cyril, Saxena, Ashish, Sountsov, Pavel, Moore, Dave, Saurous, Rif A., Hoffman, Matthew D., Dillon, Joshua V.

arXiv.org Machine Learning, Feb-4-2020

Markov chain Monte Carlo (MCMC) is widely regarded as one of the most important algorithms of the 20th century. Its guarantees of asymptotic convergence, stability, and estimator-variance bounds using only unnormalized probability functions make it indispensable to probabilistic programming. In this paper, we introduce the TensorFlow Probability MCMC toolkit, and discuss some of the considerations that motivated its design.
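The point that MCMC needs only an unnormalized probability function can be shown with a short random-walk Metropolis sampler. This sketch is written in plain Python and does not use or imitate the tfp.mcmc API; the function name, the step size, and the standard-normal target are assumptions chosen for illustration.

```python
import math
import random


def random_walk_metropolis(log_prob, x0, num_samples, step=1.0, seed=0):
    """Sample from a 1-D density known only up to a normalizing constant."""
    rng = random.Random(seed)
    x, lp = x0, log_prob(x0)
    samples = []
    for _ in range(num_samples):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(x)); the unknown
        # normalizing constant cancels in this ratio.
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
        samples.append(x)
    return samples


def unnormalized_log_normal(x):
    """Log density of a standard normal, with the constant term dropped."""
    return -0.5 * x * x


samples = random_walk_metropolis(unnormalized_log_normal, 0.0, 20_000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

With enough samples the empirical mean and variance should approach 0 and 1. A production toolkit would add step-size adaptation, batching across many chains, and convergence diagnostics on top of this basic loop.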

chain state, mcmc, parallelism, (11 more...)

arXiv.org Machine Learning

2002.01184

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.63)

DALC: Distributed Automatic LSTM Customization for Fine-Grained Traffic Speed Prediction

Lee, Ming-Chang, Lin, Jia-Chun

arXiv.org Machine Learning, Feb-4-2020

Over the past decade, several approaches have been introduced for short-term traffic prediction. However, providing fine-grained traffic prediction for large-scale transportation networks where numerous detectors are geographically deployed to collect traffic data is still an open issue. To address this issue, in this paper, we formulate the problem of customizing an LSTM model for a single detector into a finite Markov decision process and then introduce an Automatic LSTM Customization (ALC) algorithm to automatically customize an LSTM model for a single detector such that the corresponding prediction accuracy can be as satisfactory as possible and the time consumption can be as low as possible. Based on the ALC algorithm, we introduce a distributed approach called Distributed Automatic LSTM Customization (DALC) to customize an LSTM model for every detector in large-scale transportation networks. Our experiment demonstrates that the DALC provides higher prediction accuracy than several approaches provided by Apache Spark MLlib.

detector, lstm model, prediction accuracy, (12 more...)

arXiv.org Machine Learning

2001.09821

Country:

North America > United States > California (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
Europe > Norway (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry:

Transportation > Infrastructure & Services (0.88)
Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Effectively Trainable Semi-Quantum Restricted Boltzmann Machine

Lyakhova, Ya. S., Polyakov, E. A., Rubtsov, A. N.

arXiv.org Machine Learning, Feb-4-2020

We propose a novel quantum model for the restricted Boltzmann machine (RBM), in which the visible units remain classical whereas the hidden units are quantized as noninteracting fermions. The free motion of the fermions is parametrically coupled to the classical signal of the visible units. This model possesses a quantum behaviour such as coherences between the hidden units. Numerical experiments show that this fact makes it more powerful than the classical RBM with the same number of hidden units. At the same time, a significant advantage of the proposed model over the other approaches to the Quantum Boltzmann Machine (QBM) is that it is exactly solvable and efficiently trainable on a classical computer: there is a closed expression for the log-likelihood gradient with respect to its parameters. This fact makes it interesting not only as a model of a hypothetical quantum simulator, but also as a quantum-inspired classical machine-learning algorithm.
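For reference, the log-likelihood gradient of the classical RBM (the baseline the abstract compares against, not the semi-quantum model itself) can be computed exactly for a tiny model by enumerating all visible states. The sketch below is an illustration under assumptions: binary units, hypothetical parameter values, and brute-force summation that is only feasible for a handful of units.

```python
import itertools
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def free_energy(v, a, b, W):
    """F(v) = -a.v - sum_j log(1 + exp(b_j + sum_i v_i W[i][j]))."""
    visible_term = sum(ai * vi for ai, vi in zip(a, v))
    hidden_term = sum(
        math.log1p(math.exp(b[j] + sum(v[i] * W[i][j] for i in range(len(v)))))
        for j in range(len(b))
    )
    return -visible_term - hidden_term


def log_likelihood(v, a, b, W):
    """Exact log p(v), with the partition function summed by brute force."""
    states = itertools.product([0, 1], repeat=len(a))
    log_z = math.log(sum(math.exp(-free_energy(s, a, b, W)) for s in states))
    return -free_energy(v, a, b, W) - log_z


def grad_W(v, a, b, W):
    """d log p(v) / d W[i][j] = <v_i h_j>_data - <v_i h_j>_model."""
    n_v, n_h = len(a), len(b)

    def stats(s):
        # s_i times the conditional probability that hidden unit j is on.
        return [[s[i] * sigmoid(b[j] + sum(s[k] * W[k][j] for k in range(n_v)))
                 for j in range(n_h)] for i in range(n_v)]

    data = stats(v)
    model = [[0.0] * n_h for _ in range(n_v)]
    for s in itertools.product([0, 1], repeat=n_v):
        p = math.exp(log_likelihood(s, a, b, W))
        s_stats = stats(s)
        for i in range(n_v):
            for j in range(n_h):
                model[i][j] += p * s_stats[i][j]
    return [[data[i][j] - model[i][j] for j in range(n_h)] for i in range(n_v)]


# Hypothetical parameters for a model with 2 visible and 2 hidden units.
a, b = [0.1, -0.2], [0.3, 0.0]
W = [[0.5, -0.4], [0.2, 0.7]]
gradient = grad_W((1, 0), a, b, W)
```

The model expectation is the expensive term: it sums over every visible configuration, which is why classical RBMs are usually trained with sampling-based approximations rather than this exact form.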

rbm, restricted boltzmann machine, sqrbm, (15 more...)

arXiv.org Machine Learning

2001.08997

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Russia (0.05)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Generating Digital Twins with Multiple Sclerosis Using Probabilistic Neural Networks

Walsh, Jonathan R., Smith, Aaron M., Pouliot, Yannick, Li-Bland, David, Loukianov, Anton, Fisher, Charles K.

arXiv.org Machine Learning, Feb-3-2020

Multiple Sclerosis (MS) is a neurodegenerative disorder characterized by a complex set of clinical assessments. We use an unsupervised machine learning model called a Conditional Restricted Boltzmann Machine (CRBM) to learn the relationships between covariates commonly used to characterize subjects and their disease progression in MS clinical trials. A CRBM is capable of generating digital twins, which are simulated subjects having the same baseline data as actual subjects. Digital twins allow for subject-level statistical analyses of disease progression. The CRBM is trained using data from 2395 subjects enrolled in the placebo arms of clinical trials across the three primary subtypes of MS. We discuss how CRBMs are trained and show that digital twins generated by the model are statistically indistinguishable from their actual subject counterparts along a number of measures.

covariate, crbm, digital twin, (15 more...)

arXiv.org Machine Learning

2002.02779

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England (0.04)
Europe > Poland > Lublin Province > Lublin (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Multiple Sclerosis (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Quantifying Hypothesis Space Misspecification in Learning from Human-Robot Demonstrations and Physical Corrections

Bobu, Andreea, Bajcsy, Andrea, Fisac, Jaime F., Deglurkar, Sampada, Dragan, Anca D.

arXiv.org Machine Learning, Feb-3-2020

Human input has enabled autonomous systems to improve their capabilities and achieve complex behaviors that are otherwise challenging to generate automatically. Recent work focuses on how robots can use such input, like demonstrations or corrections, to learn intended objectives. These techniques assume that the human's desired objective already exists within the robot's hypothesis space. In reality, this assumption is often inaccurate: there will always be situations where the person might care about aspects of the task that the robot does not know about. Without this knowledge, the robot cannot infer the correct objective. Hence, when the robot's hypothesis space is misspecified, even methods that keep track of uncertainty over the objective fail because they reason about which hypothesis might be correct, and not whether any of the hypotheses are correct. In this paper, we posit that the robot should reason explicitly about how well it can explain human inputs given its hypothesis space and use that situational confidence to inform how it should incorporate human input. We demonstrate our method on a 7 degree-of-freedom robot manipulator in learning from two important types of human input: demonstrations of manipulation tasks, and physical corrections during the robot's task execution.

correction, demonstration, robot, (15 more...)

arXiv.org Machine Learning

doi: 10.1109/TRO.2020.2971415

2002.00941

Country:

North America > United States > California > Alameda County > Berkeley (0.28)
North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Torch-Struct: Deep Structured Prediction Library

Rush, Alexander M.

arXiv.org Machine Learning, Feb-3-2020

The literature on structured prediction for NLP describes a rich collection of distributions and algorithms over sequences, segmentations, alignments, and trees; however, these algorithms are difficult to utilize in deep learning frameworks. We introduce Torch-Struct, a library for structured prediction designed to take advantage of and integrate with vectorized, auto-differentiation based frameworks. Torch-Struct includes a broad collection of probabilistic structures accessed through a simple and flexible distribution-based API that connects to any deep learning model. The library utilizes batched, vectorized operations and exploits auto-differentiation to produce readable, fast, and testable code. Internally, we also include a number of general-purpose optimizations to provide cross-algorithm efficiency. Experiments show significant performance gains over fast baselines and case-studies demonstrate the benefits of the library.
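As an example of the kind of algorithm such a library packages, the log-partition function of a linear-chain model can be computed with the forward algorithm in log space. The sketch below is plain Python written for illustration; it does not use Torch-Struct or reproduce its API, and the function names and score values are hypothetical.

```python
import itertools
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def linear_chain_log_partition(emit, trans):
    """Forward algorithm in log space.

    emit[t][y] scores label y at position t; trans[p][y] scores a transition
    from label p to label y. Returns the log of the sum over all label
    sequences of exp(total score).
    """
    alpha = list(emit[0])
    for t in range(1, len(emit)):
        alpha = [
            emit[t][y]
            + logsumexp([alpha[p] + trans[p][y] for p in range(len(alpha))])
            for y in range(len(emit[t]))
        ]
    return logsumexp(alpha)


def brute_force_log_partition(emit, trans):
    """Same quantity by enumerating every label sequence (tiny inputs only)."""
    n, k = len(emit), len(emit[0])
    scores = []
    for seq in itertools.product(range(k), repeat=n):
        s = sum(emit[t][seq[t]] for t in range(n))
        s += sum(trans[seq[t - 1]][seq[t]] for t in range(1, n))
        scores.append(s)
    return logsumexp(scores)


# Hypothetical scores for a sequence of length 3 with 2 labels.
emit = [[0.2, -0.1], [0.0, 0.5], [1.0, -1.0]]
trans = [[0.3, -0.2], [-0.5, 0.1]]
log_z = linear_chain_log_partition(emit, trans)
```

The forward pass costs time linear in sequence length, while enumeration is exponential; differentiating the log-partition with respect to the scores yields marginals, which is the property auto-differentiation frameworks can exploit.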

algorithm, prediction, torch-struct, (15 more...)

arXiv.org Machine Learning

2002.00876

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)

Automatic structured variational inference

Ambrogioni, Luca, Hinne, Max, van Gerven, Marcel

arXiv.org Machine Learning, Feb-3-2020

The aim of probabilistic programming is to automate every aspect of probabilistic inference in arbitrary probabilistic models (programs) so that the user can focus her attention on modeling, without dealing with ad-hoc inference methods. Gradient-based automatic differentiation stochastic variational inference offers an attractive option as the default method for (differentiable) probabilistic programming as it combines high performance with high computational efficiency. However, the performance of any (parametric) variational approach depends on the choice of an appropriate variational family. Here, we introduce a fully automatic method for constructing structured variational families inspired by the closed-form update in conjugate models. These pseudo-conjugate families incorporate the forward pass of the input probabilistic program and can capture complex statistical dependencies. Pseudo-conjugate families have the same space and time complexity as the input probabilistic program and are therefore tractable in a very large class of models. We validate our automatic variational method on a wide range of high-dimensional inference problems including deep learning components.

probabilistic program, variational family, variational inference, (14 more...)

arXiv.org Machine Learning

2002.00643

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
(2 more...)

Deep Learning (Interview With Dong Yu)

#artificialintelligence, Feb-2-2020, 16:13:09 GMT

Dr. Dong Yu is a principal researcher at Microsoft Research. His research has focused on speech recognition and applications of machine learning techniques. He has published two monographs and over 150 papers in these areas and is the inventor/co-inventor of nearly 60 granted/pending patents. His recent work on the context-dependent deep neural network hidden Markov model (CD-DNN-HMM), which was recognized by the IEEE SPS 2013 best paper award, caused a paradigm shift in large-vocabulary speech recognition. Dr. Dong Yu is currently serving as a member of the IEEE Speech and Language Processing Technical Committee (2013-).

application, deep learning, speech recognition, (9 more...)

#artificialintelligence

Country: North America > Canada > Quebec > Montreal (0.05)

Genre: Personal > Interview (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.91)
