AITopics | irl model

Inverse Reinforcement Learning for Minimum-Exposure Paths in Spatiotemporally Varying Scalar Fields

Ballentine, Alexandra E., Cowlagi, Raghvendra V.

arXiv.org Artificial Intelligence, Mar-9-2025

Performance and reliability analyses of autonomous vehicles (AVs) can benefit from tools that "amplify" small datasets to synthesize larger volumes of plausible samples of the AV's behavior. We consider a specific instance of this data synthesis problem that addresses minimizing the AV's exposure to adverse environmental conditions during travel to a fixed goal location. The environment is characterized by a threat field, which is a strictly positive scalar field with higher intensities corresponding to hazardous and unfavorable conditions for the AV. We address the problem of synthesizing datasets of minimum-exposure paths that resemble a training dataset of such paths. The main contribution of this paper is an inverse reinforcement learning (IRL) model to solve this problem. We consider time-invariant (static) as well as time-varying (dynamic) threat fields. We find that the proposed IRL model provides excellent performance in synthesizing paths from initial conditions not seen in the training dataset, when the threat field is the same as that used for training. Furthermore, we evaluate model performance on unseen threat fields and find low error in that case as well. Finally, we demonstrate the model's ability to synthesize distinct datasets when trained on different datasets with distinct characteristics.
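To make the notion of a minimum-exposure path concrete, here is a minimal sketch for a static threat field. It is not the paper's IRL model: it assumes the field is discretized on a 4-connected grid, that exposure is the sum of the threat values of the cells entered, and it uses plain Dijkstra search. The function name `min_exposure_path` and the sample grid are hypothetical.

```python
import heapq

def min_exposure_path(threat, start, goal):
    """Dijkstra on a 4-connected grid.

    Assumption: the cost of a move is the threat value of the cell entered,
    so the path cost is the accumulated exposure. Threat values must be
    strictly positive, as in the abstract.
    """
    rows, cols = len(threat), len(threat[0])
    best = {start: 0.0}          # lowest exposure found so far per cell
    parent = {}                  # back-pointers for path reconstruction
    frontier = [(0.0, start)]
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if cost > best[cell]:
            continue             # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + threat[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(frontier, (new_cost, (nr, nc)))
    # Walk the back-pointers from the goal to the start.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[goal]

# Hypothetical 3x3 threat field with a high-threat column in the middle.
threat = [
    [1.0, 9.0, 1.0],
    [1.0, 9.0, 1.0],
    [1.0, 1.0, 1.0],
]
path, exposure = min_exposure_path(threat, (0, 0), (0, 2))
print(path, exposure)
```

On this grid the path detours around the high-threat column rather than crossing it. A time-varying field would need time added to the search state, which this sketch does not attempt.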

dataset, reward function, threat field, (15 more...)

arXiv.org Artificial Intelligence

2503.06611

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Worcester County > Worcester (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


On complementing end-to-end human motion predictors with planning

Sun, Liting, Jia, Xiaogang, Dragan, Anca D.

arXiv.org Artificial Intelligence, Mar-9-2021

High-capacity end-to-end approaches for human motion prediction have the ability to represent subtle nuances in human behavior, but struggle with robustness to out-of-distribution inputs and tail events. Planning-based prediction, on the other hand, can reliably output decent-but-not-great predictions: it is much more stable in the face of distribution shift, but it has high inductive bias, missing important aspects that drive human decisions, and ignoring cognitive biases that make human behavior suboptimal. In this work, we analyze one family of approaches that strive to get the best of both worlds: use the end-to-end predictor on common cases, but do not rely on it for tail events or out-of-distribution inputs, and switch to the planning-based predictor there. We contribute an analysis of different approaches for detecting when to make this switch, using an autonomous driving domain. We find that promising approaches based on ensembling or generative modeling of the training distribution might not be reliable, but that there are very simple methods that can perform surprisingly well, including training a classifier to pick up on tell-tale issues in predicted trajectories.
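The switching scheme described in this abstract can be sketched as a thin wrapper around two predictors. This is an illustration only: `learned`, `planner`, and `looks_unreliable` are hypothetical stand-ins, and the hand-written step-size check merely takes the place of the trained classifier the abstract mentions.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]

def predict_with_fallback(
    history: Trajectory,
    learned: Callable[[Trajectory], Trajectory],
    planner: Callable[[Trajectory], Trajectory],
    looks_unreliable: Callable[[Trajectory, Trajectory], bool],
) -> Trajectory:
    """Use the end-to-end prediction unless the detector flags it."""
    candidate = learned(history)
    if looks_unreliable(history, candidate):
        return planner(history)   # fall back to the planning-based predictor
    return candidate

# --- Toy stand-ins (assumptions, not the paper's models) ---

def learned(history: Trajectory) -> Trajectory:
    # Repeat the last observed displacement for three steps.
    (x0, y0), (x1, y1) = history[-2], history[-1]
    dx, dy = x1 - x0, y1 - y0
    return [(x1 + k * dx, y1 + k * dy) for k in range(1, 4)]

def planner(history: Trajectory) -> Trajectory:
    # Conservative placeholder: hold the last observed position.
    return [history[-1]] * 3

def looks_unreliable(history: Trajectory, predicted: Trajectory,
                     max_step: float = 2.0) -> bool:
    # Flag predictions containing an implausibly large single step.
    pts = [history[-1], *predicted]
    return any(abs(bx - ax) + abs(by - ay) > max_step
               for (ax, ay), (bx, by) in zip(pts, pts[1:]))

typical = predict_with_fallback([(0.0, 0.0), (1.0, 0.0)],
                                learned, planner, looks_unreliable)
unusual = predict_with_fallback([(0.0, 0.0), (5.0, 0.0)],
                                learned, planner, looks_unreliable)
print(typical, unusual)
```

Keeping the detector as a separate callable means the same wrapper could host an ensemble-disagreement score, a generative-model likelihood, or a trained classifier, which are the kinds of detectors the abstract compares.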

ade, prediction, predictor, (17 more...)

arXiv.org Artificial Intelligence

2103.05661

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Transportation > Ground > Road (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.66)


Generalized Maximum Causal Entropy for Inverse Reinforcement Learning

Mai, Tien, Chan, Kennard, Jaillet, Patrick

arXiv.org Machine Learning, Nov-15-2019

We consider the problem of learning from demonstrated trajectories with inverse reinforcement learning (IRL). Motivated by a limitation of the classical maximum entropy model (Ziebart, Bagnell, and Dey 2010) in capturing the structure of the network of states, we propose an IRL model based on a generalized version of the causal entropy maximization problem, which allows us to generate a class of maximum entropy IRL models. Our generalized model has the advantage of being able to recover, in addition to a reward function, another expert's function that would (partially) capture the impact of the connecting structure of the states on experts' decisions. Empirical evaluation on a real-world dataset and a grid-world dataset shows that our generalized model outperforms the classical ones in terms of recovering reward functions and demonstrated trajectories.
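For background, the classical maximum causal entropy model that this abstract builds on yields a softmax policy computed by a "soft" Bellman backup. The sketch below shows only that classical forward computation on a tiny deterministic MDP, not the paper's generalized model and not the reward-fitting step of IRL. The function names, the finite-horizon setup, and the two-state example are assumptions made for illustration.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def soft_value_iteration(rewards, transitions, horizon):
    """Finite-horizon soft Bellman backups on a deterministic MDP.

    rewards[s][a]     : reward for taking action a in state s
    transitions[s][a] : index of the next state
    Returns policy[s][a] = exp(Q(s, a) - V(s)); requires horizon >= 1.
    """
    n = len(rewards)
    V = [0.0] * n
    Q = []
    for _ in range(horizon):
        Q = [[rewards[s][a] + V[transitions[s][a]]
              for a in range(len(rewards[s]))] for s in range(n)]
        V = [log_sum_exp(q) for q in Q]   # soft maximum over actions
    return [[math.exp(q - V[s]) for q in Q[s]] for s in range(n)]

# Hypothetical example: state 0 has two actions leading to absorbing state 1,
# with rewards 0 and 1; state 1 has a single zero-reward self-loop.
rewards = [[0.0, 1.0], [0.0]]
transitions = [[1, 1], [1]]
policy = soft_value_iteration(rewards, transitions, horizon=3)
print(policy[0])
```

In this example the two actions in state 0 receive probabilities of roughly 0.27 and 0.73, that is, proportional to the exponentiated rewards. An IRL procedure would adjust the reward values so that such a policy matches the demonstrated trajectories.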

irl model, reward function, trajectory, (12 more...)

arXiv.org Machine Learning

1911.06928

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.64)

Industry: Transportation > Ground > Road (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


Inverse Reinforcement Learning with Missing Data

Mai, Tien, Nguyen, Quoc Phong, Low, Kian Hsiang, Jaillet, Patrick

arXiv.org Artificial Intelligence, Nov-15-2019

We consider the problem of recovering an expert's reward function with inverse reinforcement learning (IRL) when there are missing/incomplete state-action pairs or observations in the demonstrated trajectories. This issue of missing trajectory data or information occurs in many situations, e.g., GPS signals from vehicles moving on a road network are intermittent. In this paper, we propose a tractable approach to directly compute the log-likelihood of demonstrated trajectories with incomplete/missing data. Our algorithm is efficient in handling a large number of missing segments in the demonstrated trajectories, as it performs the training with incomplete data by solving a sequence of systems of linear equations, and the number of such systems to be solved does not depend on the number of missing segments. Empirical evaluation on a real-world dataset shows that our training algorithm outperforms other conventional techniques.

algorithm, linear equation, trajectory, (15 more...)

arXiv.org Artificial Intelligence

1911.0693

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Transportation > Ground > Road (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
