
Collaborating Authors

 Bobu, Andreea


Inducing Structure in Reward Learning by Learning Features

arXiv.org Artificial Intelligence

Whether it's semi-autonomous driving (Sadigh et al. 2016), recommender systems (Ziebart et al. 2008), or household robots working in close proximity with people (Jain et al. 2015), reward learning can greatly benefit autonomous agents to generate behaviors that adapt to new situations or human preferences. Under this framework, the robot uses the person's input to learn a reward function that describes how they prefer the task to be performed. For instance, in the scenario in Fig. 1, the human wants the robot to keep the cup away from the laptop to prevent spilling liquid over it; she may communicate this preference to the robot by providing a demonstration of the task or even by directly intervening during the robot's task execution to correct it. […] In doing so, however, these approaches sacrifice the sample efficiency and generalizability that a well-specified feature set offers. While using an expressive function approximator to extract features and learn their reward combination at once seems advantageous, many such functions can induce policies that explain the demonstrations. Hence, to disambiguate between all these candidate functions, the robot requires a very large amount of (laborious to collect) data, and this data needs to be diverse enough to identify the true reward. For example, the human in the household robot setting in Figure 1 might want to demonstrate keeping the cup away from the laptop, but from a single demonstration the robot could find many other explanations for the person's behavior: perhaps they always happened to keep the cup upright or they really…
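To make the ambiguity argument concrete, the following minimal Python sketch (an editorial illustration, not code from the paper) models the reward as a weighted sum of features and shows that a single demonstration can look optimal under several different weight vectors; the feature names and numbers are assumptions.

import numpy as np

def trajectory_features(traj):
    # Sum per-state features over a trajectory (columns: distance to laptop, cup tilt, height above table).
    return np.sum(traj, axis=0)

# One human demonstration and two alternative trajectories the robot could compare it to (toy numbers).
demo = np.array([[0.9, 0.0, 0.3], [0.8, 0.0, 0.3], [0.9, 0.1, 0.3]])
alternatives = [
    np.array([[0.2, 0.0, 0.3], [0.1, 0.0, 0.3], [0.2, 0.1, 0.3]]),  # passes close to the laptop
    np.array([[0.9, 0.8, 0.3], [0.8, 0.9, 0.3], [0.9, 0.9, 0.3]]),  # tilts the cup a lot
]

# Two different reward hypotheses: "stay far from the laptop" vs. "keep the cup upright".
candidate_weights = {
    "avoid_laptop": np.array([1.0, 0.0, 0.0]),
    "keep_upright": np.array([0.0, -1.0, 0.0]),
}

for name, w in candidate_weights.items():
    demo_reward = w @ trajectory_features(demo)
    # Both hypotheses rank the demonstration at least as high as every alternative,
    # so this single demonstration cannot disambiguate between them.
    explains_demo = all(demo_reward >= w @ trajectory_features(alt) for alt in alternatives)
    print(f"{name}: demonstration looks optimal -> {explains_demo}")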


Learning Perceptual Concepts by Bootstrapping from Human Queries

arXiv.org Artificial Intelligence

Robots need to be able to learn concepts from their users in order to adapt their capabilities to each user's unique task. But when the robot operates on high-dimensional inputs, like images or point clouds, this is impractical: the robot needs an unrealistic amount of human effort to learn the new concept. To address this challenge, we propose a new approach whereby the robot learns a low-dimensional variant of the concept and uses it to generate a larger data set for learning the concept in the high-dimensional space. This lets it take advantage of semantically meaningful privileged information only accessible at training time, like object poses and bounding boxes, that allows for richer human interaction to speed up learning. We evaluate our approach by learning prepositional concepts that describe object state or multi-object relationships, like above, near, or aligned, which are key to user specification of task goals and execution constraints for robots. Using a simulated human, we show that our approach improves sample complexity when compared to learning concepts directly in the high-dimensional space. We also demonstrate the utility of the learned concepts in motion planning tasks on a 7-DoF Franka Panda robot.
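A minimal sketch of the bootstrapping idea under toy assumptions (the concept "above", scikit-learn's LogisticRegression as the low-dimensional learner, and random pose data are illustrative choices, not the paper's implementation): fit a cheap classifier on a handful of human-labeled, privileged pose features, then use it to auto-label a much larger set that could supervise a high-dimensional model.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def privileged_features(pose_a, pose_b):
    # Relative pose of object A with respect to object B; only available at training time.
    return pose_a - pose_b

# 1) A handful of human-labeled queries in the low-dimensional space.
z_a = np.linspace(-1.0, 1.0, 20)                                   # heights of object A
poses_a = np.column_stack([rng.uniform(-1, 1, (20, 2)), z_a])
poses_b = np.column_stack([rng.uniform(-1, 1, (20, 2)), np.zeros(20)])
x_low = privileged_features(poses_a, poses_b)
y_human = (x_low[:, 2] > 0.1).astype(int)                          # stand-in for the person's yes/no answers

concept_low = LogisticRegression().fit(x_low, y_human)

# 2) Generate many more scenes and label them with the cheap low-dimensional concept.
big_a = rng.uniform(-1, 1, (5000, 3))
big_b = rng.uniform(-1, 1, (5000, 3))
auto_labels = concept_low.predict(privileged_features(big_a, big_b))

# 3) The (raw scene, auto_label) pairs would then train the high-dimensional classifier
#    on images or point clouds; here we only report how much supervision was generated.
print(f"{len(auto_labels)} auto-labeled examples from {len(y_human)} human answers")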


Dynamically Switching Human Prediction Models for Efficient Planning

arXiv.org Artificial Intelligence

As environments involving both robots and humans become increasingly common, so does the need to account for people during planning. To plan effectively, robots must be able to respond to and sometimes influence what humans do. This requires a human model which predicts future human actions. A simple model may assume the human will continue what they did previously; a more complex one might predict that the human will act optimally, disregarding the robot; whereas an even more complex one might capture the robot's ability to influence the human. These models make different trade-offs between computational time and performance of the resulting robot plan. Using only one model of the human either wastes computational resources or is unable to handle critical situations. In this work, we give the robot access to a suite of human models and enable it to assess the performance-computation trade-off online. By estimating how an alternate model could improve human prediction and how that may translate to performance gain, the robot can dynamically switch human models whenever the additional computation is justified. Our experiments in a driving simulator showcase how the robot can achieve performance comparable to always using the best human model, but with greatly reduced computation.
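The trade-off can be sketched in a few lines of Python; the gain estimate, costs, and model names below are toy assumptions rather than the paper's algorithm, but they show the switching rule: move to a richer human model only when the estimated performance gain justifies the extra planning time.

from dataclasses import dataclass

@dataclass
class HumanModel:
    name: str
    compute_cost: float   # planning time (seconds) this predictor requires

models = [HumanModel("constant-velocity", 0.01),
          HumanModel("rational-human", 0.10),
          HumanModel("influence-aware", 0.50)]

def estimated_gain(current, candidate, situation_risk):
    # Proxy for how much better the robot's plan could get with the candidate model;
    # here we simply assume richer (more expensive) models help more in riskier situations.
    return situation_risk * max(candidate.compute_cost - current.compute_cost, 0.0)

def select_model(current, situation_risk, time_value=1.0):
    # Switch only if the estimated performance gain outweighs the extra computation,
    # with time_value converting seconds of planning into plan-quality units.
    best, best_score = current, 0.0
    for m in models:
        score = estimated_gain(current, m, situation_risk) - time_value * (m.compute_cost - current.compute_cost)
        if score > best_score:
            best, best_score = m, score
    return best

print(select_model(models[0], situation_risk=0.2).name)   # calm traffic -> keep the cheap model
print(select_model(models[0], situation_risk=5.0).name)   # risky merge  -> pay for a richer model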


Feature Expansive Reward Learning: Rethinking Human Input

arXiv.org Artificial Intelligence

In collaborative human-robot scenarios, when a person is not satisfied with how a robot performs a task, they can intervene to correct it. Reward learning methods enable the robot to adapt its reward function online based on such human input. However, this online adaptation requires low sample complexity algorithms which rely on simple functions of handcrafted features. In practice, pre-specifying an exhaustive set of features the person might care about is impossible; what should the robot do when the human correction cannot be explained by the features it already has access to? Recent progress in deep Inverse Reinforcement Learning (IRL) suggests that the robot could fall back on demonstrations: ask the human for demonstrations of the task, and recover a reward defined over not just the known features, but also the raw state space. Our insight is that rather than implicitly learning about the missing feature(s) from task demonstrations, the robot should instead ask for data that explicitly teaches it about what it is missing. We introduce a new type of human input, in which the person guides the robot from areas of the state space where the feature she is teaching is highly expressed to states where it is not. We propose an algorithm for learning the feature from the raw state space and integrating it into the reward function. By focusing the human input on the missing feature, our method decreases sample complexity and improves generalization of the learned reward over the above deep IRL baseline. We show this in experiments with a 7DOF robot manipulator. Finally, we discuss our method's potential implications for deep reward learning more broadly: taking a divide-and-conquer approach that focuses on important features separately before learning from demonstrations can improve generalization in tasks where such features are easy for the human to teach.
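One way to turn such guided input into a learned feature is a pairwise ranking loss over each trace, as in the hedged PyTorch sketch below; the network size, the softplus loss, and the random stand-in trace are assumptions, not necessarily the paper's exact design. The person moves the robot from states where the feature is highly expressed to states where it is not, so earlier states in a trace should score higher than later ones.

import torch
import torch.nn as nn

class FeatureNet(nn.Module):
    # Small network mapping a raw robot state to a scalar feature value.
    def __init__(self, state_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, s):
        return self.net(s).squeeze(-1)

def trace_ranking_loss(phi, trace):
    # Earlier states in the human-guided trace should receive larger feature values.
    values = phi(trace)                                            # (T,)
    i, j = torch.triu_indices(len(values), len(values), offset=1)  # all ordered pairs i < j
    return nn.functional.softplus(values[j] - values[i]).mean()    # logistic pairwise loss

state_dim = 7                                     # e.g., joint angles of a 7-DoF arm
trace = torch.randn(10, state_dim).cumsum(dim=0)  # stand-in for one recorded feature trace

phi = FeatureNet(state_dim)
opt = torch.optim.Adam(phi.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    trace_ranking_loss(phi, trace).backward()
    opt.step()

# The learned feature would then be appended to the robot's feature set and its
# reward weight adapted online from the person's corrections.
print(phi(trace).detach())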


Learning under Misspecified Objective Spaces

arXiv.org Artificial Intelligence

Learning robot objective functions from human input has become increasingly important, but state-of-the-art techniques assume that the human's desired objective lies within the robot's hypothesis space. When this is not true, even methods that keep track of uncertainty over the objective fail because they reason about which hypothesis might be correct, and not whether any of the hypotheses are correct. We focus specifically on learning from physical human corrections during the robot's task execution, where not having a rich enough hypothesis space leads to the robot updating its objective in ways that the person did not actually intend. We observe that such corrections appear irrelevant to the robot, because they are not the best way of achieving any of the candidate objectives. Instead of naively trusting and learning from every human interaction, we propose robots learn conservatively by reasoning in real time about how relevant the human's correction is for the robot's hypothesis space. We test our inference method in an experiment with human interaction data, and demonstrate that this alleviates unintended learning in an in-person user study with a 7DoF robot manipulator.
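A toy Python sketch of conservative learning under stated assumptions (the two handcrafted features, the cosine-style relevance measure, and the exponential update are illustrative, not the paper's exact inference): estimate how relevant a correction is to the hypothesis space and temper the belief update accordingly, so corrections that no hypothesis explains barely change the objective.

import numpy as np

# Candidate objectives: each hypothesis weighs two handcrafted features
# ("distance to table", "distance to laptop"); values are toy assumptions.
hypotheses = np.array([[1.0, 0.0],      # cares about the table
                       [0.0, 1.0],      # cares about the laptop
                       [0.7, 0.7]])     # cares about both
belief = np.ones(len(hypotheses)) / len(hypotheses)

def relevance(correction_delta):
    # How well does the observed change in features align with *some* hypothesis?
    # Near 1.0: a hypothesis explains it; near 0.0: none of them do.
    unit = correction_delta / (np.linalg.norm(correction_delta) + 1e-8)
    dirs = hypotheses / np.linalg.norm(hypotheses, axis=1, keepdims=True)
    return float(np.clip(np.max(dirs @ unit), 0.0, 1.0))

def update_belief(belief, correction_delta, beta):
    # Soft Bayesian-style update over the hypotheses, tempered by the relevance estimate.
    posterior = belief * np.exp(beta * (hypotheses @ correction_delta))
    return posterior / posterior.sum()

relevant_correction = np.array([0.0, 0.9])     # pushes the arm away from the laptop
irrelevant_correction = np.array([-0.6, 0.1])  # not well explained by any hypothesis

for delta in (relevant_correction, irrelevant_correction):
    beta = relevance(delta)
    print(f"relevance={beta:.2f}, belief after update:",
          np.round(update_belief(belief, delta, beta), 2))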