AITopics | meta-inverse reinforcement learning

Meta-Inverse Reinforcement Learning with Probabilistic Context Variables

Neural Information Processing Systems | Dec-25-2025, 05:01:05 GMT

Reinforcement learning demands a reward function, which is often difficult to provide or design in real-world applications. While inverse reinforcement learning (IRL) holds promise for automatically learning reward functions from demonstrations, several major challenges remain. First, existing IRL methods learn reward functions from scratch, requiring large numbers of demonstrations to correctly infer the reward for each task the agent may need to perform. Second, and more subtly, existing methods typically assume demonstrations of one isolated behavior or task, while in practice it is significantly more natural and scalable to provide datasets of heterogeneous behaviors. To this end, we propose a deep latent variable model that is capable of learning rewards from unstructured, multi-task demonstration data and, critically, of using this experience to infer robust rewards for new, structurally similar tasks from a single demonstration. Our experiments on multiple continuous control tasks demonstrate the effectiveness of our approach compared to state-of-the-art imitation and inverse reinforcement learning methods.

demonstration, meta-inverse reinforcement learning, probabilistic context variable, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Reviews: Meta-Inverse Reinforcement Learning with Probabilistic Context Variables

Neural Information Processing Systems | Jan-22-2025, 16:45:18 GMT

The paper identifies the unsolved problem of meta-inverse reinforcement learning: learning a reward function for an unseen task from a single expert trajectory for that task, using a batch of expert trajectories for different but related tasks as training data (the task solved by each training trajectory is not communicated to the learning algorithm). Because IRL is used rather than imitation learning, a reward function is learned for each task (or rather, a single reward function parameterized by the latent variable m, which is supposed to capture the task). The paper then formulates a framework for training neural networks to solve the identified problem, building on past work on Adversarial IRL and adding latent task variables to handle the variation across tasks. A network q_psi is used to identify the task variable from a demonstration.
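To make the pieces named in the review concrete, here is a minimal sketch of how a task inference step and a task-conditioned reward could fit together. It is not the paper's architecture: the linear maps, the mean pooling, and the names `infer_context`, `reward`, `discriminator`, `weights`, and `theta` are all assumptions made for illustration. Only the discriminator form D = exp(f) / (exp(f) + pi) is taken from Adversarial IRL, and conditioning it on m is assumed here.

```python
import math
import random


def infer_context(trajectory, weights):
    """Hypothetical stand-in for q_psi: map a demonstration to a task vector m.

    trajectory: list of (state, action) pairs, each a list of floats.
    weights: one row of floats per latent dimension, applied to the
             mean-pooled [state, action] features.
    Returns only a point estimate of m (no variance, for brevity).
    """
    features = [s + a for s, a in trajectory]           # concatenate s and a
    n = len(features)
    pooled = [sum(col) / n for col in zip(*features)]   # average over time
    return [sum(w * x for w, x in zip(row, pooled)) for row in weights]


def reward(state, action, m, theta):
    """Single reward function shared by all tasks, conditioned on m.

    A linear function is assumed purely to keep the sketch short.
    """
    x = state + action + m
    return sum(t * xi for t, xi in zip(theta, x))


def discriminator(state, action, m, theta, policy_prob):
    """AIRL-style discriminator D = exp(f) / (exp(f) + pi(a | s, m))."""
    f = reward(state, action, m, theta)
    return math.exp(f) / (math.exp(f) + policy_prob)


if __name__ == "__main__":
    random.seed(0)
    # One demonstration of a new task: 5 steps, 2-D states, 1-D actions.
    demo = [([random.random(), random.random()], [random.random()])
            for _ in range(5)]
    weights = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.6]]       # 2-D latent m
    theta = [0.1, 0.2, -0.3, 0.5, -0.4]                  # 2 + 1 + 2 inputs
    m = infer_context(demo, weights)
    s, a = demo[0]
    print(reward(s, a, m, theta), discriminator(s, a, m, theta, 0.25))
```

The point of the sketch is the data flow: a single demonstration yields m, and the same reward parameters then score state-action pairs for that task without retraining.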

expert trajectory, reinforcement learning, reward function, (11 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Reviews: Meta-Inverse Reinforcement Learning with Probabilistic Context Variables

Neural Information Processing Systems | Jan-22-2025, 16:45:06 GMT

The reviewers agree that the paper is interesting and a good contribution.

meta-inverse reinforcement learning, probabilistic context variable

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)

Meta-Inverse Reinforcement Learning with Probabilistic Context Variables

Yu, Lantao; Yu, Tianhe; Finn, Chelsea; Ermon, Stefano

Neural Information Processing Systems | Mar-19-2020, 01:30:45 GMT

Reinforcement learning demands a reward function, which is often difficult to provide or design in real-world applications. While inverse reinforcement learning (IRL) holds promise for automatically learning reward functions from demonstrations, several major challenges remain. First, existing IRL methods learn reward functions from scratch, requiring large numbers of demonstrations to correctly infer the reward for each task the agent may need to perform. Second, and more subtly, existing methods typically assume demonstrations of one isolated behavior or task, while in practice it is significantly more natural and scalable to provide datasets of heterogeneous behaviors. To this end, we propose a deep latent variable model that is capable of learning rewards from unstructured, multi-task demonstration data and, critically, of using this experience to infer robust rewards for new, structurally similar tasks from a single demonstration.

demonstration, meta-inverse reinforcement learning, probabilistic context variable, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
