Gupta, Abhishek
Canada Protocol: an ethical checklist for the use of Artificial Intelligence in Suicide Prevention and Mental Health
Mörch, Carl-Maria, Gupta, Abhishek, Mishara, Brian L.
Introduction: To improve current public health strategies in suicide prevention and mental health, governments, researchers, and private companies increasingly use information and communication technologies, and more specifically Artificial Intelligence and Big Data. These technologies are promising but raise ethical challenges rarely covered by current legal systems. It is essential to better identify and prevent potential ethical risks. Objectives: The Canada Protocol - MHSP is a tool to guide and support professionals, users, and researchers using AI in mental health and suicide prevention. Methods: A checklist was constructed based on ten international reports on AI and ethics and two guides on mental health and new technologies. 329 recommendations were identified, of which 43 were considered applicable to mental health and AI. The checklist was validated using a two-round Delphi consultation. Results: 16 experts participated in the first round of the Delphi consultation and 8 participated in the second round. Of the original 43 items, 38 were retained. They fall into five categories: "Description of the Autonomous Intelligent System" (n=8), "Privacy and Transparency" (n=8), "Security" (n=6), "Health-Related Risks" (n=8), and "Biases" (n=8). The checklist was considered relevant by most users, though it may need versions tailored to each category of target users.
Learning latent state representation for speeding up exploration
Vezzani, Giulia, Gupta, Abhishek, Natale, Lorenzo, Abbeel, Pieter
Exploration is an extremely challenging problem in reinforcement learning, especially in high-dimensional state and action spaces and when only sparse rewards are available. Effective representations can indicate which components of the state are task-relevant and thus reduce the dimensionality of the space to explore. In this work, we take a representation learning viewpoint on exploration, utilizing prior experience to learn effective latent representations, which can subsequently indicate which regions to explore. Prior experience on separate but related tasks helps learn representations of the state which are effective at predicting instantaneous rewards. These learned representations can then be used with an entropy-based exploration method to perform exploration in high-dimensional spaces by effectively lowering the dimensionality of the search space. We show the benefits of this representation for meta-exploration in a simulated object pushing environment.
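To make the two-stage recipe concrete, here is a minimal sketch in Python/PyTorch, assuming a simple reward-predictive encoder and a count-based novelty bonus as a stand-in for the entropy-based exploration method; all module names, interfaces, and hyperparameters are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class RewardPredictiveEncoder(nn.Module):
        """Maps a high-dimensional state to a low-dimensional latent trained
        to predict instantaneous reward (illustrative, not the paper's code)."""
        def __init__(self, state_dim, latent_dim=4):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(state_dim, 128), nn.ReLU(),
                nn.Linear(128, latent_dim))
            self.reward_head = nn.Linear(latent_dim, 1)

        def forward(self, state):
            z = self.encoder(state)
            return z, self.reward_head(z)

    def train_on_prior_tasks(model, batches, lr=1e-3):
        # Phase 1: supervised pretraining on separate but related tasks,
        # fitting the latent so it is predictive of instantaneous reward.
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for states, rewards in batches:
            _, pred = model(states)
            loss = nn.functional.mse_loss(pred.squeeze(-1), rewards)
            opt.zero_grad(); loss.backward(); opt.step()

    def exploration_bonus(model, state, visit_counts, bins=10):
        # Phase 2: explore a new task in the learned latent space; a simple
        # count-based bonus over discretized latents stands in for the
        # entropy-based method, exploiting the lowered dimensionality.
        with torch.no_grad():
            z, _ = model(state)
        cell = tuple((z * bins).long().tolist())
        visit_counts[cell] = visit_counts.get(cell, 0) + 1
        return visit_counts[cell] ** -0.5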
Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms
Gupta, Abhishek, Tendolkar, Gaurav, Chen, Hao, Pi, Jianzong
Recursive stochastic algorithms have gained significant attention in the recent past due to data-driven applications. Examples include stochastic gradient descent for solving large-scale optimization problems and empirical dynamic programming algorithms for solving Markov decision problems. These recursive stochastic algorithms approximate certain contraction operators and can be viewed within the framework of iterated random maps. Accordingly, we consider iterated random maps over a Polish space that simulate a contraction operator over that Polish space. Assume that the iterated maps are indexed by $n$ such that as $n\rightarrow\infty$, each realization of the random map converges (in some sense) to the contraction map it is simulating. We show that, starting from the same initial condition, the distribution of the random sequence generated by the iterated random maps converges weakly to the trajectory generated by the contraction operator. We further show that under certain conditions, the time average of the random sequence converges to the spatial mean of the invariant distribution. We then apply these results to logistic regression, empirical value iteration, empirical Q-value iteration, and empirical relative value iteration for finite state finite action MDPs.
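Informally, the setting can be written as follows (a paraphrase of the abstract in assumed notation, not the paper's exact statements): let $(X, d)$ be a Polish space, $T : X \to X$ a contraction with fixed point $x^*$, and $\hat{T}_n$ a random map approximating $T$ increasingly well as $n \to \infty$.

    \[
      x_{n+1} = \hat{T}_n(x_n), \qquad
      \hat{T}_n \longrightarrow T \quad \text{(in a suitable sense) as } n \to \infty,
    \]
    \[
      \operatorname{Law}(x_n) \;\Rightarrow\; \delta_{T^n(x_0)}, \qquad
      \frac{1}{N} \sum_{n=1}^{N} f(x_n) \;\longrightarrow\; \int_X f \, d\mu,
    \]

where $\Rightarrow$ denotes weak convergence, $\delta_y$ is the point mass at $y$, and $\mu$ is the invariant distribution of the induced Markov chain; the time-average limit holds only under the paper's additional conditions.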
Guided Meta-Policy Search
Mendonca, Russell, Gupta, Abhishek, Kralev, Rosen, Abbeel, Pieter, Levine, Sergey, Finn, Chelsea
Reinforcement learning (RL) algorithms have demonstrated promising results on complex tasks, yet often require impractical numbers of samples because they learn from scratch. Meta-RL aims to address this challenge by leveraging experience from previous tasks in order to more quickly solve new tasks. However, in practice, these algorithms generally also require large amounts of on-policy experience during the meta-training process, making them impractical for use in many problems. To address this, we propose to learn a reinforcement learning procedure through imitation of expert policies that solve previously-seen tasks. This involves a nested optimization, with RL in the inner loop and supervised imitation learning in the outer loop. Because the outer loop imitation learning can be done with off-policy data, we can achieve significant gains in meta-learning sample efficiency. In this paper, we show how this general idea can be used both for meta-reinforcement learning and for learning fast RL procedures from multi-task demonstration data. The former results in an approach that can leverage policies learned for previous tasks without significant amounts of on-policy data during meta-training, whereas the latter is particularly useful in cases where demonstrations are easy for a person to provide. Across a number of continuous control meta-RL problems, we demonstrate significant improvements in meta-RL sample efficiency in comparison to prior work as well as the ability to scale to domains with visual observations.
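The nested optimization can be sketched as follows in Python (illustrative; `train_expert`, `imitation_loss`, and the `adapt`/`sample_trajectory` interfaces are hypothetical stand-ins, not the authors' API): per-task expert policies are obtained in the inner loop with standard RL, and the meta-policy is trained in the outer loop by supervised imitation of those experts, which only requires off-policy data.

    import torch

    def guided_meta_policy_search(tasks, meta_policy, train_expert,
                                  imitation_loss, outer_steps=1000, lr=1e-3):
        """Sketch: RL in the inner loop, supervised imitation in the outer
        loop (all interfaces hypothetical)."""
        # Inner loop: solve each meta-training task individually, e.g. with
        # an off-the-shelf RL algorithm or directly from demonstrations.
        experts = {task: train_expert(task) for task in tasks}

        opt = torch.optim.Adam(meta_policy.parameters(), lr=lr)
        for _ in range(outer_steps):
            loss = 0.0
            for task in tasks:
                # Off-policy data: states the expert visited, with its actions.
                states, actions = experts[task].sample_trajectory()
                # Policy after the meta-learner's fast adaptation to this task.
                adapted = meta_policy.adapt(task)
                # Outer loop: the adapted policy imitates the task expert.
                loss = loss + imitation_loss(adapted(states), actions)
            opt.zero_grad(); loss.backward(); opt.step()
        return meta_policy

Because the imitation targets come from stored expert trajectories, the outer loop consumes no fresh on-policy rollouts, which is where the sample-efficiency gain described above comes from.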
AIR5: Five Pillars of Artificial Intelligence Research
Ong, Yew-Soon, Gupta, Abhishek
The original inspiration of artificial intelligence (AI) was to build autonomous systems capable of demonstrating humanlike intelligence. Likewise, the related field of computational intelligence (CI) emerged in an attempt to artificially recreate the consummate learning and problem-solving ability depicted by various natural phenomena - including the workings of the biological brain. However, in the present day, the combined effects of (i) the relatively easy access to massive and growing volumes of data, (ii) rapid increase in computational power, and (iii) the steady improvements in data-driven machine learning (ML) algorithms [1, 2], have played a major role in helping modern AI systems vastly surpass humanly achievable performance across a variety of applications.
Currently, many (if not most) of the innovations in AI are driven by ML techniques centered around the use of so-called deep neural network (DNN) models [2]. The design of DNNs is loosely based on the complex biological neural network that makes up a human brain - which (unsurprisingly) has drawn significant interest among CI researchers as a dominant source of intelligence in the natural world. However, DNNs are often criticized for being highly opaque. It has been widely acknowledged that although these models can frequently attain remarkable prediction accuracies, their layered nonlinear structure makes it exceedingly difficult (if not impossible) to unambiguously interpret why a certain set of inputs leads to a particular output / prediction / decision. As a result, at least at present, these models have come to be viewed mainly as
Meta-Reinforcement Learning of Structured Exploration Strategies
Gupta, Abhishek, Mendonca, Russell, Liu, YuXuan, Abbeel, Pieter, Levine, Sergey
Exploration is a fundamental challenge in reinforcement learning (RL). Many current exploration methods for deep RL use task-agnostic objectives, such as information gain or bonuses based on state visitation. However, many practical applications of RL involve learning more than a single task, and prior tasks can be used to inform how exploration should be performed in new tasks. In this work, we study how prior tasks can inform an agent about how to explore effectively in new situations. We introduce a novel gradient-based fast adaptation algorithm – model agnostic exploration with structured noise (MAESN) – to learn exploration strategies from prior experience. The prior experience is used both to initialize a policy and to acquire a latent exploration space that can inject structured stochasticity into a policy, producing exploration strategies that are informed by prior knowledge and are more effective than random action-space noise. We show that MAESN is more effective at learning exploration strategies when compared to prior meta-RL methods, RL without learned exploration strategies, and task-agnostic exploration methods. We evaluate our method on a variety of simulated tasks: locomotion with a wheeled robot, locomotion with a quadrupedal walker, and object manipulation.
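A minimal sketch of the latent-noise idea in Python/PyTorch (names and sizes are our assumptions, not the MAESN reference implementation): a latent variable is sampled once per episode from a learned distribution, the policy conditions on it throughout the rollout, and the distribution's parameters can be adapted by policy gradient, so the injected noise is structured and temporally coherent rather than i.i.d. per action step.

    import torch
    import torch.nn as nn

    class LatentConditionedPolicy(nn.Module):
        """Policy conditioned on a per-episode latent z (illustrative)."""
        def __init__(self, state_dim, action_dim, latent_dim=2):
            super().__init__()
            # Learned per-task parameters of the latent exploration space.
            self.mu = nn.Parameter(torch.zeros(latent_dim))
            self.log_sigma = nn.Parameter(torch.zeros(latent_dim))
            self.net = nn.Sequential(
                nn.Linear(state_dim + latent_dim, 64), nn.Tanh(),
                nn.Linear(64, action_dim))

        def sample_latent(self):
            # Reparameterized sample, drawn once per episode so that the
            # stochasticity is consistent across the whole rollout.
            eps = torch.randn_like(self.mu)
            return self.mu + self.log_sigma.exp() * eps

        def forward(self, state, z):
            return self.net(torch.cat([state, z], dim=-1))

    # Usage: hold z fixed within an episode; adapt mu/log_sigma across episodes.
    policy = LatentConditionedPolicy(state_dim=10, action_dim=2)
    z = policy.sample_latent()
    action = policy(torch.randn(10), z)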
Soft Actor-Critic Algorithms and Applications
Haarnoja, Tuomas, Zhou, Aurick, Hartikainen, Kristian, Tucker, George, Ha, Sehoon, Tan, Jie, Kumar, Vikash, Zhu, Henry, Gupta, Abhishek, Abbeel, Pieter, Levine, Sergey
Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample complexity and brittleness to hyperparameters. Both of these challenges limit the applicability of such methods to real-world domains. In this paper, we describe Soft Actor-Critic (SAC), our recently introduced off-policy actor-critic algorithm based on the maximum entropy RL framework. In this framework, the actor aims to simultaneously maximize expected return and entropy; that is, to succeed at the task while acting as randomly as possible. We extend SAC to incorporate a number of modifications that accelerate training and improve stability with respect to the hyperparameters, including a constrained formulation that automatically tunes the temperature hyperparameter. We systematically evaluate SAC on a range of benchmark tasks, as well as challenging real-world tasks such as locomotion for a quadrupedal robot and robotic manipulation with a dexterous hand. With these improvements, SAC achieves state-of-the-art performance, outperforming prior on-policy and off-policy methods in sample efficiency and asymptotic performance. Furthermore, we demonstrate that, in contrast to other off-policy algorithms, our approach is very stable, achieving similar performance across different random seeds. These results suggest that SAC is a promising candidate for learning in real-world robotics tasks.
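The constrained formulation mentioned above treats the temperature as a Lagrange multiplier and adjusts it so the policy's entropy stays near a target value. Below is a minimal sketch of that temperature update in Python/PyTorch; the loss form follows the maximum entropy RL literature, while variable names, the optimizer, and hyperparameters are illustrative.

    import torch

    # Automatic temperature tuning: minimize
    #   J(alpha) = E[ -alpha * (log_pi + target_entropy) ],
    # which raises alpha when policy entropy drops below the target
    # (encouraging more exploration) and lowers it otherwise.
    log_alpha = torch.zeros(1, requires_grad=True)
    alpha_opt = torch.optim.Adam([log_alpha], lr=3e-4)
    target_entropy = -2.0  # common heuristic: minus the action dimension

    def temperature_update(log_pi):
        """log_pi: log-probabilities of actions sampled from the current policy."""
        alpha_loss = -(log_alpha.exp() * (log_pi + target_entropy).detach()).mean()
        alpha_opt.zero_grad()
        alpha_loss.backward()
        alpha_opt.step()
        return log_alpha.exp().item()  # current temperature alpha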
Learning Actionable Representations with Goal-Conditioned Policies
Ghosh, Dibya, Gupta, Abhishek, Levine, Sergey
Representation learning is a central challenge across a range of machine learning areas. In reinforcement learning, effective and functional representations have the potential to tremendously accelerate learning progress and solve more challenging problems. Most prior work on representation learning has focused on generative approaches, learning representations that capture all underlying factors of variation in the observation space in a more disentangled or well-ordered manner. In this paper, we instead aim to learn functionally salient representations: representations that are not necessarily complete in terms of capturing all factors of variation in the observation space, but rather aim to capture those factors of variation that are important for decision making -- that are "actionable." These representations are aware of the dynamics of the environment, and capture only the elements of the observation that are necessary for decision making rather than all factors of variation, without explicit reconstruction of the observation. We show how these representations can be useful to improve exploration for sparse reward problems, to enable long horizon hierarchical reinforcement learning, and as a state representation for learning policies for downstream tasks. We evaluate our method on a number of simulated environments, and compare it to prior methods for representation learning, exploration, and hierarchical reinforcement learning.
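One plausible way to cash out "actionable" as a training signal, sketched in Python/PyTorch below, is to require that the distance between two states in latent space track how differently a goal-conditioned policy behaves when each state is used as the goal; this objective, and every name in the sketch, is our assumption for illustration rather than the paper's exact formulation.

    import torch
    import torch.nn as nn

    class ActionableEncoder(nn.Module):
        """Encoder whose latent distances are trained to reflect
        differences in policy behavior, not raw observation variation."""
        def __init__(self, state_dim, latent_dim=8):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, 128), nn.ReLU(),
                nn.Linear(128, latent_dim))

        def forward(self, s):
            return self.net(s)

    def actionable_loss(encoder, policy, states, goal_a, goal_b):
        # Behavioral discrepancy: how differently the goal-conditioned
        # policy acts on the same states when aimed at goal_a vs. goal_b.
        with torch.no_grad():
            act_gap = (policy(states, goal_a) - policy(states, goal_b)).abs().mean()
        # Match latent distance to that discrepancy; note there is no
        # reconstruction term, so decision-irrelevant factors can be discarded.
        z_gap = (encoder(goal_a) - encoder(goal_b)).norm()
        return (z_gap - act_gap) ** 2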
Guiding Policies with Language via Meta-Learning
Co-Reyes, John D., Gupta, Abhishek, Sanjeev, Suvansh, Altieri, Nick, DeNero, John, Abbeel, Pieter, Levine, Sergey
Behavioral skills or policies for autonomous agents are conventionally learned from reward functions, via reinforcement learning, or from demonstrations, via imitation learning. However, both modes of task specification have their disadvantages: reward functions require manual engineering, while demonstrations require a human expert to be able to actually perform the task in order to generate the demonstration. Natural language instructions provide an appealing alternative: in the same way that we can specify goals to other humans simply by speaking or writing, we would like to be able to specify tasks for our machines. However, a single instruction may be insufficient to fully communicate our intent or, even if it is, may be insufficient for an autonomous agent to actually understand how to perform the desired task. In this work, we propose an interactive formulation of the task specification problem, where iterative language corrections are provided to an autonomous agent, guiding it in acquiring the desired skill. Our proposed language-guided policy learning algorithm can integrate an instruction and a sequence of corrections to acquire new skills very quickly. In our experiments, we show that this method can enable a policy to follow instructions and corrections for simulated navigation and manipulation tasks, substantially outperforming direct, non-interactive instruction following.
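The interactive protocol can be summarized with a short Python sketch (all callables are hypothetical interfaces, not the authors' code): the agent attempts the task given the instruction and the corrections received so far, and after each failed attempt the supervisor adds a new language correction for the next attempt.

    def interactive_skill_acquisition(policy, rollout, get_correction, solved,
                                      instruction, max_rounds=5):
        """Sketch of iterative language-guided skill acquisition; `rollout`,
        `get_correction`, and `solved` are hypothetical callables."""
        corrections = []
        trajectory = rollout(policy, instruction, corrections)
        for _ in range(max_rounds):
            if solved(trajectory):
                break
            # The supervisor inspects the failed attempt and issues a new
            # natural-language correction; the policy conditions on the
            # instruction together with the full history of corrections.
            corrections.append(get_correction(trajectory))
            trajectory = rollout(policy, instruction, corrections)
        return policy, trajectory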