Mixtures of Experts Unlock Parameter Scaling for Deep RL
Johan Obando-Ceron, Ghada Sokar, Timon Willi, Clare Lyle, Jesse Farebrother, Jakob Foerster, Gintare Karolina Dziugaite, Doina Precup, Pablo Samuel Castro
The recent rapid progress in (self) supervised learning models is in large part predicted by empirical scaling laws: a model's performance scales proportionally to its size. Analogous scaling laws remain elusive for reinforcement learning domains, however, where increasing the parameter count of a model often hurts its final performance. In this paper, we demonstrate that incorporating Mixture-of-Expert (MoE) modules, and in particular Soft MoEs (Puigcerver et al., 2023), into value-based networks results in more parameter-scalable models, evidenced by substantial performance increases across a variety of training regimes and model sizes. This work thus provides strong empirical evidence towards developing scaling laws for reinforcement learning.
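To make the mechanism concrete, below is a minimal NumPy sketch of the Soft MoE routing scheme (Puigcerver et al., 2023) that the abstract refers to. It is an illustration of the published algorithm, not the authors' implementation: the names (`soft_moe`, `phi`), the layer sizes, and the use of plain linear maps as stand-ins for the expert MLPs are all assumptions made for this example. In the paper, the input tokens come from the convolutional encoder of a value-based network (e.g., DQN or Rainbow), with the MoE replacing a penultimate dense layer.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_moe(x, phi, expert_weights):
    """One Soft MoE layer, following Puigcerver et al. (2023).

    x:              (m, d) input tokens (e.g., flattened conv features)
    phi:            (d, n * p) learnable slot parameters
                    (n experts, p slots per expert)
    expert_weights: list of n (d, d) matrices; stand-ins for expert MLPs
                    (an assumption -- real experts are small networks)
    """
    n = len(expert_weights)
    p = phi.shape[1] // n
    logits = x @ phi                    # (m, n*p) token-slot affinities
    dispatch = softmax(logits, axis=0)  # per-slot softmax over tokens
    combine = softmax(logits, axis=1)   # per-token softmax over slots
    slots = dispatch.T @ x              # (n*p, d): soft averages of tokens
    # Each expert processes only its own p slots.
    outs = [slots[i * p:(i + 1) * p] @ w
            for i, w in enumerate(expert_weights)]
    return combine @ np.concatenate(outs, axis=0)  # (m, d)

rng = np.random.default_rng(0)
m, d, n, p = 36, 64, 4, 2  # e.g., a 6x6 conv feature map -> 36 tokens
x = rng.normal(size=(m, d))
phi = rng.normal(size=(d, n * p))
experts = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(n)]
y = soft_moe(x, phi, experts)  # (36, 64): same shape as the input tokens
```

Because dispatch and combine are dense softmaxes rather than hard top-k routing, every token contributes to every slot; this keeps the layer fully differentiable and avoids the load-balancing losses that hard-routed MoEs typically need, which is part of why Soft MoE transfers well to value-based RL training.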
arXiv.org Artificial Intelligence
Feb-13-2024