Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment

Zhou, Weichao, Li, Wenchao

arXiv.org Artificial Intelligence

Many imitation learning (IL) algorithms use inverse reinforcement learning (IRL) to infer a reward function that aligns with the demonstrations. However, the inferred reward function often fails to capture the underlying task objective. In this paper, we propose a novel framework for IRL-based IL that prioritizes task alignment over conventional data alignment. Our framework is a semi-supervised approach that leverages expert demonstrations as weak supervision signals to derive a set of candidate reward functions that align with the task rather than only with the data. It adopts an adversarial mechanism to train a policy with this set of reward functions to gain a collective validation of the policy's ability to accomplish the task. We provide theoretical insights into this framework's ability to mitigate task-reward misalignment and present a practical implementation. Our experimental results show that our framework outperforms conventional IL baselines in complex and transfer learning scenarios. The complete code is available at https://github.com/zwc662/PAGAR.
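
To make the "collective validation" idea concrete, the following is a minimal, self-contained sketch of training a policy against a set of candidate reward functions by improving its worst-case return over that set. The toy dynamics, linear rewards, and random hill-climbing update are simplifying assumptions for illustration only; the authors' actual algorithm uses an adversarial IRL loop and is available at the linked repository.

# Hypothetical sketch: improve a policy's worst-case return over candidate rewards.
import numpy as np

rng = np.random.default_rng(0)

def candidate_rewards(num_candidates=4, obs_dim=3):
    # Stand-in for IRL: sample linear reward weights "consistent with demonstrations".
    return [rng.normal(size=obs_dim) for _ in range(num_candidates)]

def rollout_return(policy_w, reward_w, horizon=20, obs_dim=3):
    # Toy rollout: observations drift in the direction the policy weights point.
    obs, total = np.zeros(obs_dim), 0.0
    for _ in range(horizon):
        obs = obs + 0.1 * np.tanh(policy_w)      # toy deterministic dynamics
        total += float(reward_w @ obs)           # return under this candidate reward
    return total

def train_worst_case(rewards, iters=200, step=0.05, obs_dim=3):
    # Random hill-climbing on the worst-case return across all candidate rewards.
    policy = rng.normal(size=obs_dim)
    for _ in range(iters):
        worst = min(rollout_return(policy, r) for r in rewards)
        trial = policy + step * rng.normal(size=obs_dim)
        if min(rollout_return(trial, r) for r in rewards) > worst:
            policy = trial                       # keep updates that help the worst case
    return policy

if __name__ == "__main__":
    R = candidate_rewards()
    pi = train_worst_case(R)
    print("worst-case return:", min(rollout_return(pi, r) for r in R))

A policy that scores well under every candidate reward, rather than under a single inferred one, is less likely to exploit quirks of a misaligned reward.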


PAGAR: Taming Reward Misalignment in Inverse Reinforcement Learning-Based Imitation Learning with Protagonist Antagonist Guided Adversarial Reward

Zhou, Weichao, Li, Wenchao

arXiv.org Artificial Intelligence

Many imitation learning (IL) algorithms employ inverse reinforcement learning (IRL) to infer the intrinsic reward function that an expert is implicitly optimizing for based on their demonstrated behaviors. However, in practice, IRL-based IL can fail to accomplish the underlying task due to a misalignment between the inferred reward and the objective of the task. In this paper, we address the susceptibility of IL to such misalignment by introducing a semi-supervised reward design paradigm called Protagonist Antagonist Guided Adversarial Reward (PAGAR). PAGAR-based IL trains a policy to perform well under mixed reward functions instead of a single reward function as in IRL-based IL. We identify the theoretical conditions under which PAGAR-based IL can avoid the task failures caused by reward misalignment. We also present a practical on-and-off policy approach to implementing PAGAR-based IL. Experimental results show that our algorithm outperforms standard IL baselines in complex tasks and challenging transfer settings.
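
One way to read the protagonist-antagonist mechanism is as a minimax regret problem over a set of candidate reward functions derived from the demonstrations. The formulation below is a sketch in notation of our own choosing (the symbols U_r, R_E, \pi_P, \pi_A are assumptions, not necessarily the paper's):

\pi_P^{\star} \in \arg\min_{\pi_P \in \Pi} \; \max_{r \in R_E} \Big( \max_{\pi_A \in \Pi} U_r(\pi_A) - U_r(\pi_P) \Big)

where U_r(\pi) denotes the expected return of policy \pi under reward r, \pi_P is the protagonist policy being trained, \pi_A is the antagonist policy that exposes how much better one could do under each candidate reward, and R_E is a set of reward functions under which the expert demonstrations perform well. Minimizing this worst-case regret keeps the protagonist from failing the task under any reward in the candidate set.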