RL-DARTS: Differentiable Architecture Search for Reinforcement Learning
Miao, Yingjie, Song, Xingyou, Peng, Daiyi, Yue, Summer, Brevdo, Eugene, Faust, Aleksandra
arXiv.org Artificial Intelligence
We introduce RL-DARTS, one of the first applications of Differentiable Architecture Search (DARTS) in reinforcement learning (RL), which searches for convolutional cells on the Procgen benchmark. We outline the initial difficulties of applying neural architecture search techniques in RL, and demonstrate that by simply replacing the image encoder with a DARTS supernet, our search method is sample-efficient, requires minimal extra compute resources, and is compatible with both off-policy and on-policy RL algorithms, needing only minor changes to preexisting code. Surprisingly, we find that the supernet can be used as an actor for inference to generate replay data in standard RL training loops, and can thus be trained end-to-end. Throughout this training process, we show that the supernet gradually learns better cells, leading to alternative architectures that can be highly competitive against manually designed policies and that also verify previous design choices for RL policies.
Jun-3-2021
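The abstract refers to replacing the image encoder with a DARTS supernet. As a rough illustration of the general DARTS idea (not the authors' implementation), the sketch below shows a single "mixed operation": each candidate op is applied to the input and the outputs are combined with softmax weights over learnable architecture logits, after which the highest-weighted op is kept when the cell is discretized. The 1-D feature vector, the three candidate ops (`skip`, `zero`, `avg_pool`), and the names `mixed_op` and `discretize` are assumptions chosen for brevity; the paper's actual search space consists of convolutional cells.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate ops on a 1-D feature vector. They stand in for the
# convolution / pooling / skip candidates of a real DARTS search space.
def identity(x):
    return list(x)


def zero(x):
    return [0.0] * len(x)


def avg_pool3(x):
    # 3-wide average pooling with edge replication, so the length is preserved.
    n = len(x)
    return [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0 for i in range(n)]


CANDIDATE_OPS = {"skip": identity, "zero": zero, "avg_pool": avg_pool3}


def mixed_op(x, alpha):
    """Continuous relaxation: softmax(alpha)-weighted sum of all candidate outputs.

    `alpha` maps each op name to its architecture logit (illustrative layout).
    """
    weights = softmax([alpha[name] for name in CANDIDATE_OPS])
    outputs = [op(x) for op in CANDIDATE_OPS.values()]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]


def discretize(alpha):
    """After search, keep only the op with the largest architecture logit."""
    return max(alpha, key=alpha.get)
```

In a full supernet, the logits in `alpha` would be trained by gradient descent together with the network weights, which is what allows the search to run inside an ordinary RL training loop as the abstract describes.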