PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals

Neural Information Processing Systems

Learning with sparse rewards remains a significant challenge in reinforcement learning (RL), especially when the aim is to train a policy capable of achieving multiple different goals. To date, the most successful approaches for dealing with multi-goal, sparse-reward environments have been model-free RL algorithms. In this work we propose PlanGAN, a model-based algorithm specifically designed for solving multi-goal tasks in environments with sparse rewards. Our method builds on the fact that any trajectory of experience collected by an agent contains useful information about how to achieve the goals observed during that trajectory. We use this to train an ensemble of conditional generative models (GANs) to generate plausible trajectories that lead the agent from its current state towards a specified goal. We then use these imagined trajectories within a novel planning algorithm in order to achieve the desired goal as efficiently as possible. We evaluate PlanGAN on a number of robotic navigation/manipulation tasks against a range of model-free reinforcement learning baselines, including Hindsight Experience Replay. Our studies indicate that PlanGAN achieves comparable performance whilst being around 4-8 times more sample efficient.
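The core loop the abstract describes — roll imagined trajectories out of an ensemble of goal-conditioned generators, then plan by picking the candidate that gets closest to the goal — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the generators here are toy stand-ins for trained GANs, and all names (`make_generator`, `imagine_trajectory`, `plan_action`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_generator(seed):
    """Stand-in for one trained conditional GAN generator mapping
    (state, goal, noise) -> predicted next state. A real generator
    would be a neural network; this toy just takes a noisy step
    towards the goal."""
    def generator(state, goal, noise):
        return state + 0.2 * (goal - state) + 0.05 * noise
    return generator

# Ensemble of generators (the paper trains several conditional GANs).
ensemble = [make_generator(s) for s in range(3)]

def imagine_trajectory(gen, state, goal, horizon=10):
    """Roll one generator forward to produce an imagined trajectory."""
    traj = [state]
    for _ in range(horizon):
        noise = rng.standard_normal(state.shape)
        state = gen(state, goal, noise)
        traj.append(state)
    return np.stack(traj)

def plan_action(state, goal, n_candidates=20):
    """Score imagined trajectories by final distance to the goal and
    return the first step of the best one (a simple scoring rule; the
    paper's planner is more involved)."""
    best_step, best_dist = None, np.inf
    for _ in range(n_candidates):
        gen = ensemble[rng.integers(len(ensemble))]
        traj = imagine_trajectory(gen, state, goal)
        dist = np.linalg.norm(traj[-1] - goal)
        if dist < best_dist:
            best_dist, best_step = dist, traj[1] - traj[0]
    return best_step

state, goal = np.zeros(2), np.ones(2)
action = plan_action(state, goal)
```

Sampling candidates across the ensemble, rather than from a single model, is what lets disagreement between generators surface as trajectory diversity during planning.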



Review for NeurIPS paper: PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals

Neural Information Processing Systems

This paper proposes using an ensemble of GANs to learn a goal-conditioned forward model of trajectories for use in planning. The model is trained using a variant of hindsight experience replay, resulting in an agent that can succeed at sparse goal-conditioned tasks with much better data efficiency than model-free approaches. All reviewers highlighted the impressiveness of the experimental results, with R1 and R2 finding the approach very interesting, and R3 and R4 emphasising the work's potential impact. I agree that this paper will likely be of broad interest to the RL community at NeurIPS and therefore recommend acceptance. However, several reviewers also noted the lack of comparison to other model-based approaches.
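The hindsight-style training the review refers to can be illustrated with a small sketch: every state actually reached later in a collected trajectory is relabelled as a goal, turning even a failed rollout into goal-conditioned training data for the generative model. The function name and data layout below are illustrative, not taken from the paper's code.

```python
def relabel_trajectory(states):
    """Turn one trajectory of visited states into goal-conditioned
    training triples (state, goal, next_state), where the goal is any
    state the agent actually reached later in the same trajectory."""
    examples = []
    for t in range(len(states) - 1):
        for k in range(t + 1, len(states)):
            examples.append((states[t], states[k], states[t + 1]))
    return examples

# A 1-D toy trajectory: 4 states yield 6 relabelled training triples.
traj = [0.0, 0.5, 1.0, 1.5]
data = relabel_trajectory(traj)
```

Each triple reads as "from this state, when aiming for that (achieved) goal, the agent moved here", which is exactly the kind of supervision a goal-conditioned forward model needs even when the original episode never reached its commanded goal.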



PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals

Charlesworth, Henry, Montana, Giovanni

arXiv.org Artificial Intelligence
