AITopics | sadigh



Listwise Reward Estimation for Offline Preference-based Reinforcement Learning

Choi, Heewoong, Jung, Sangwon, Ahn, Hongjoon, Moon, Taesup

arXiv.org Artificial Intelligence | Aug-7-2024

In Reinforcement Learning (RL), designing precise reward functions remains a challenge, particularly when aligning with human intent. Preference-based RL (PbRL) was introduced to address this problem by learning reward models from human feedback. However, existing PbRL methods often overlook second-order preference, which indicates the relative strength of preference. In this paper, we propose Listwise Reward Estimation (LiRE), a novel approach for offline PbRL that leverages second-order preference information by constructing a Ranked List of Trajectories (RLT), which can be built efficiently using the same ternary feedback type as traditional methods. To validate the effectiveness of LiRE, we propose a new offline PbRL dataset that objectively reflects the effect of the estimated rewards. Our extensive experiments on the dataset show that LiRE outperforms state-of-the-art baselines even with modest feedback budgets and remains robust to both the amount of feedback and feedback noise. Our code is available at https://github.com/chwoong/LiRE
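As a rough illustration of how a ranked list could be assembled from ternary feedback (first preferred, second preferred, or equally preferred), the sketch below uses binary insertion with tie groups. This is an assumption for illustration and not necessarily the procedure the authors use; the names `insert_into_ranked_list`, `build_ranked_list`, and `make_return_oracle` are hypothetical, and the oracle simply compares scalar returns with a tolerance in place of a human labeler.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def insert_into_ranked_list(
    groups: List[List[T]], item: T, compare: Callable[[T, T], int]
) -> None:
    """Insert `item` into `groups`, ordered from least to most preferred.

    Each group holds items judged equally preferred. `compare(a, b)` is the
    ternary feedback: -1 if a is less preferred, 0 if equal, +1 if more.
    """
    lo, hi = 0, len(groups)
    while lo < hi:
        mid = (lo + hi) // 2
        # Query the feedback source against one representative of the group.
        verdict = compare(item, groups[mid][0])
        if verdict == 0:
            groups[mid].append(item)  # tie: join the existing group
            return
        if verdict < 0:
            hi = mid
        else:
            lo = mid + 1
    groups.insert(lo, [item])  # no tie found: open a new group


def build_ranked_list(items: List[T], compare: Callable[[T, T], int]) -> List[List[T]]:
    """Build the full ranked list by inserting items one at a time."""
    groups: List[List[T]] = []
    for item in items:
        insert_into_ranked_list(groups, item, compare)
    return groups


def make_return_oracle(tolerance: float) -> Callable[[float, float], int]:
    """Hypothetical stand-in for a labeler: compares scalar returns."""
    def compare(a: float, b: float) -> int:
        if abs(a - b) <= tolerance:
            return 0
        return -1 if a < b else 1
    return compare


ranked = build_ranked_list([3.0, 1.0, 2.0, 1.05, 5.0], make_return_oracle(0.1))
print(ranked)
```

Each insertion needs a number of queries that grows logarithmically with the number of groups, and the position of an item in the list carries more than a single pairwise label would.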

learning, preference feedback, reward model, (14 more...)


2408.0419

Country:

Europe > Austria > Vienna (0.14)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)


A Generalized Acquisition Function for Preference-based Reward Learning

Ellis, Evan, Ghosal, Gaurav R., Russell, Stuart J., Dragan, Anca, Bıyık, Erdem

arXiv.org Artificial Intelligence | Mar-9-2024

Preference-based reward learning is a popular technique for teaching robots and autonomous systems how a human user wants them to perform a task. Previous works have shown that actively synthesizing preference queries to maximize information gain about the reward function parameters improves data efficiency. The information gain criterion focuses on precisely identifying all parameters of the reward function. This can potentially be wasteful as many parameters may result in the same reward, and many rewards may result in the same behavior in the downstream tasks. Instead, we show that it is possible to optimize for learning the reward function up to a behavioral equivalence class, such as inducing the same ranking over behaviors, distribution over choices, or other related definitions of what makes two rewards similar. We introduce a tractable framework that can capture such definitions of similarity. Our experiments in a synthetic environment, an assistive robotics environment with domain transfer, and a natural language processing problem with real datasets demonstrate the superior performance of our querying method over the state-of-the-art information gain method.

learning, reward function, trajectory, (14 more...)


2403.06003

Country: North America > United States > California (0.04)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)


Active Inverse Learning in Stackelberg Trajectory Games

Yu, Yue, Levy, Jacob, Mehr, Negar, Fridovich-Keil, David, Topcu, Ufuk

arXiv.org Artificial Intelligence | Aug-15-2023

Game-theoretic inverse learning is the problem of inferring the players' objectives from their actions. We formulate an inverse learning problem in a Stackelberg game between a leader and a follower, where each player's action is the trajectory of a dynamical system. We propose an active inverse learning method for the leader to infer which hypothesis among a finite set of candidates describes the follower's objective function. Instead of using passively observed trajectories like existing methods, the proposed method actively maximizes the differences in the follower's trajectories under different hypotheses to accelerate the leader's inference. We demonstrate the proposed method in a receding-horizon repeated trajectory game. Compared with uniformly random inputs, the leader inputs provided by the proposed method accelerate the convergence of the probability of different hypotheses conditioned on the follower's trajectory by orders of magnitude.
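The inference step described above, tracking the probability of each hypothesis given the follower's observed trajectory, can be sketched as a Bayesian update over a finite hypothesis set. The Gaussian observation model, the `noise_std` parameter, and the function name `posterior_update` are assumptions made for illustration; the abstract does not specify the likelihood the authors use.

```python
import math
from typing import List, Sequence


def posterior_update(
    prior: Sequence[float],
    predicted: Sequence[Sequence[float]],
    observed: Sequence[float],
    noise_std: float = 1.0,
) -> List[float]:
    """Update hypothesis probabilities after observing one trajectory.

    `predicted[i]` is the trajectory the follower would produce under
    hypothesis i. Assumes (for illustration only) independent Gaussian noise
    on each trajectory entry and a strictly positive prior.
    """
    log_weights = []
    for p, pred in zip(prior, predicted):
        sq_err = sum((o - x) ** 2 for o, x in zip(observed, pred))
        log_weights.append(math.log(p) - sq_err / (2.0 * noise_std ** 2))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_weights)
    weights = [math.exp(w - top) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


posterior = posterior_update(
    prior=[0.5, 0.5],
    predicted=[[0.0, 0.0], [1.0, 1.0]],
    observed=[0.0, 0.0],
)
print(posterior)
```

Under this model, the further apart the predicted trajectories are, the more a single observation shifts the posterior, which is the intuition behind choosing leader inputs that maximize the differences between hypotheses.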

artificial intelligence, follower, machine learning, (19 more...)


2308.08017

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre: Research Report (0.50)

Industry:

Transportation (0.46)
Education (0.34)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)


Active Reward Learning from Online Preferences

Myers, Vivek, Bıyık, Erdem, Sadigh, Dorsa

arXiv.org Artificial Intelligence | Feb-26-2023

Robot policies need to adapt to human preferences and/or new environments. Human experts may have the domain knowledge required to help robots achieve this adaptation. However, existing approaches often require costly offline re-training on human feedback, and that feedback usually needs to be frequent and is too complex for humans to provide reliably. To avoid placing undue burden on human experts and allow quick adaptation in critical real-world situations, we propose designing and sparingly presenting easy-to-answer pairwise action preference queries in an online fashion. Our approach designs queries and determines when to present them to maximize the expected value derived from the queries' information. We demonstrate our approach with experiments in simulation, human user studies, and real robot experiments. In these settings, our approach outperforms baseline techniques while presenting fewer queries to human experts. Experiment videos, code and appendices are found at https://sites.google.com/view/onlineactivepreferences.

artificial intelligence, machine learning, reinforcement learning, (16 more...)


2302.13507

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)


Learning Multimodal Rewards from Rankings

Myers, Vivek, Bıyık, Erdem, Anari, Nima, Sadigh, Dorsa

arXiv.org Artificial Intelligence | Sep-26-2021

Learning from human feedback has been shown to be a useful approach for acquiring robot reward functions. However, expert feedback is often assumed to be drawn from an underlying unimodal reward function. This assumption does not always hold, for example when multiple experts provide data or when a single expert provides data for different tasks. We thus go beyond learning a unimodal reward and focus on learning a multimodal reward function. We formulate multimodal reward learning as a mixture learning problem and develop a novel ranking-based learning approach, where the experts are only required to rank a given set of trajectories. Furthermore, as access to interaction data is often expensive in robotics, we develop an active querying approach to accelerate the learning process. We conduct experiments and user studies using a multi-task variant of OpenAI's LunarLander and a real Fetch robot, where we collect data from multiple users with different preferences. The results suggest that our approach can efficiently learn multimodal reward functions and improve data efficiency over benchmark methods that we adapt to our learning problem.

query, reward function, trajectory, (15 more...)


2109.1275

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.88)

Industry: Education > Focused Education > Special Education (0.44)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)


APReL: A Library for Active Preference-based Reward Learning Algorithms

Bıyık, Erdem, Talati, Aditi, Sadigh, Dorsa

arXiv.org Artificial Intelligence | Aug-16-2021

Reward learning is a fundamental problem in robotics: robots must operate in alignment with what their human users want. Many preference-based learning algorithms and active querying techniques have been proposed as solutions to this problem. In this paper, we present APReL, a library for active preference-based reward learning algorithms, which enables researchers and practitioners to experiment with the existing techniques and easily develop their own algorithms for various modules of the problem.

demonstration, query, trajectory, (16 more...)


2108.07259

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)


The key to smarter robot collaborators may be more simplicity

#artificialintelligence | Nov-13-2020, 17:50:10 GMT

Think of all the subconscious processes you perform while you're driving. As you take in information about the surrounding vehicles, you're anticipating how they might move and thinking on the fly about how you'd respond to those maneuvers. You may even be thinking about how you might influence the other drivers based on what they think you might do. If robots are to integrate seamlessly into our world, they'll have to do the same. Now researchers from Stanford University and Virginia Tech have proposed a new technique to help robots perform this kind of behavioral modeling, which they will present at the annual international Conference on Robot Learning next week.

collaborator, interaction, robot, (14 more...)


Country:

North America > United States > Virginia (0.25)
North America > Canada > Ontario > Toronto (0.16)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Artificial Intelligence Will Do What We Ask. That's a Problem. | Quanta Magazine

#artificialintelligence | Jan-30-2020, 20:08:24 GMT

The danger of having artificially intelligent machines do our bidding is that we might not be careful enough about what we wish for. The lines of code that animate these machines will inevitably lack nuance, forget to spell out caveats, and end up giving AI systems goals and incentives that don't align with our true preferences. A now-classic thought experiment illustrating this problem was posed by the Oxford philosopher Nick Bostrom in 2003. Bostrom imagined a superintelligent robot, programmed with the seemingly innocuous goal of manufacturing paper clips. The robot eventually turns the whole world into a giant paper clip factory. Such a scenario can be dismissed as academic, a worry that might arise in some far-off future.

algorithm, quanta magazine, youtube, (9 more...)


Country:

North America > United States > California > San Francisco County > San Francisco (0.06)
North America > United States > California > Alameda County > Berkeley (0.06)

Genre: Research Report (0.51)

Industry:

Government > Voting & Elections (0.51)
Transportation > Passenger (0.33)
Government > Regional Government > North America Government > United States Government (0.32)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.56)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.36)


WATCH: Self-Driving Cars Need To Learn How Humans Drive

NPR Technology | Aug-21-2018, 09:25:55 GMT

One researcher is putting real humans into computerized driving simulations to help self-driving cars learn human behavior. In the not-too-distant future, Americans will be sharing the road with self-driving cars. Companies are pouring billions of dollars into developing self-driving vehicles. Waymo, formerly the Google self-driving-car project, says that its self-driving cars have already driven millions of miles on the open road. Stanford University assistant professor Dorsa Sadigh has ridden in self-driving cars.

artificial intelligence, sadigh, self-driving car, (3 more...)


AI-Alerts: 2018 > 2018-08 > AAAI AI-Alert for Aug 21, 2018 (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
