C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory
Neural Information Processing Systems
Generative Adversarial Imitation Learning (GAIL) provides a promising approach to training a generative policy to imitate a demonstrator. It uses on-policy Reinforcement Learning (RL) to optimize a reward signal derived from an adversarial discriminator. However, optimizing GAIL is difficult in practice: the loss oscillates during training, slowing convergence. This optimization instability can prevent GAIL from finding a good policy, harming its final performance. In this paper, we study GAIL's optimization from a control-theoretic perspective.
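As a minimal sketch of the reward signal the abstract describes, the snippet below derives a per-step RL reward from a discriminator's output. The function name and the logit input are hypothetical placeholders, not part of the paper; the `-log(1 - D)` form is one common GAIL reward choice, where `D(s, a)` is the discriminator's estimated probability that a state-action pair came from the expert demonstrator.

```python
import numpy as np

def gail_reward(d_logits):
    """Turn discriminator logits into an imitation reward (illustrative sketch).

    d_logits: raw discriminator outputs for a batch of (state, action) pairs,
              where higher values mean "more expert-like".
    Returns the surrogate reward -log(1 - D), which the on-policy RL step
    maximizes; it grows as the policy fools the discriminator.
    """
    d = 1.0 / (1.0 + np.exp(-d_logits))      # sigmoid: D(s, a) in (0, 1)
    return -np.log(1.0 - d + 1e-8)           # epsilon guards against log(0)

# A pair the discriminator rates as expert-like earns more reward than one
# it rates as policy-generated.
rewards = gail_reward(np.array([-2.0, 0.0, 2.0]))
```

Because this reward is re-derived from an adversarially trained discriminator at every iteration, it is a moving target for the policy, which is one intuition for the loss oscillation the abstract refers to.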
May-26-2025, 20:57:18 GMT