Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck
The ability of policies to generalize to new environments is key to the broad application of RL agents. A promising approach to prevent an agent's policy from overfitting to a limited set of training environments is to apply regularization techniques originally developed for supervised learning. However, there are stark differences between supervised learning and RL. We discuss those differences and propose modifications to existing regularization techniques in order to better adapt them to RL. In particular, we focus on regularization techniques relying on the injection of noise into the learned function, a family that includes some of the most widely used approaches such as Dropout and Batch Normalization. To adapt them to RL, we propose Selective Noise Injection (SNI), which maintains the regularizing effect of the injected noise while mitigating its adverse effects on gradient quality. Furthermore, we demonstrate that the Information Bottleneck (IB) is a particularly well-suited regularization technique for RL, as it is effective in the low-data regime encountered early in the training of RL agents. Combining the IB with SNI, we significantly outperform current state-of-the-art results, including on the recently proposed generalization benchmark CoinRun.
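To make the two ingredients named in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the common variational form of an information bottleneck, in which a latent code is sampled from a diagonal Gaussian and penalized by its KL divergence to a standard normal, and it assumes that "selective" noise injection can be illustrated by switching the sampling noise off and by mixing a noise-free loss term with a noisy one. The function names and the weight `lam` are hypothetical and do not come from the text above.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def sample_latent(mu, log_var, inject_noise, rng):
    """Reparameterized sample z = mu + sigma * eps.

    With inject_noise=False the mean is returned unchanged, which is the
    noise-free path assumed in this sketch.
    """
    if not inject_noise:
        return list(mu)
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def mixed_loss(loss_noise_free, loss_noisy, lam):
    """Convex combination of a noise-free and a noisy loss term.

    lam is a hypothetical mixing weight in [0, 1], assumed for illustration.
    """
    return lam * loss_noise_free + (1.0 - lam) * loss_noisy
```

In this sketch, the KL term would be added to the training loss as the bottleneck penalty, while `inject_noise=False` gives a deterministic latent code for situations where noise is unwanted, such as acting in the environment.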
Reviews: Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck
This work builds on previous work on generalization in RL ([10] in the paper references) by (re-)investigating the classical stochastic regularization approaches in this context. It completes and updates the claims made in [10] by focusing on similar performance-based experiments.
Clarity: The method is clearly described in the paper.
Significance: The question of generalization in RL is of great interest to the field.
Main comments:
- The paper motivates well the problems one faces when it comes to regularization in RL.
Reviews: Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck
According to the reviews, this submission is quite easy to evaluate. All reviewers view the paper as presenting a novel and promising technique for regularization via noise injection, combined with a variational information bottleneck. Its benefits are also demonstrated by state-of-the-art performance in the CoinRun domain. Reviewers found the author feedback quite convincing, as two of the three reviewers raised their overall scores. Only a few issues were mentioned in the revised reviews, and these were considered minor.
Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck
Igl, Maximilian, Ciosek, Kamil, Li, Yingzhen, Tschiatschek, Sebastian, Zhang, Cheng, Devlin, Sam, Hofmann, Katja