Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation

Oct-9-2024, 17:21:34 GMT – Neural Information Processing Systems

While agents trained by Reinforcement Learning (RL) can solve increasingly challenging tasks directly from visual observations, generalizing learned skills to novel environments remains difficult. Extensive use of data augmentation is a promising technique for improving generalization in RL, but it is often found to decrease sample efficiency and can even lead to divergence. In this paper, we investigate causes of instability when using data augmentation in common off-policy RL algorithms. We identify two problems, both rooted in high-variance Q-targets. Based on our findings, we propose a simple yet effective technique for stabilizing this class of algorithms under augmentation.

augmentation, convnet and vision transformer, data augmentation, (4 more...)

Genre:
- Research Report > Promising Solution (0.43)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)