When should agents explore?

Pîslar, Miruna, Szepesvari, David, Ostrovski, Georg, Borsa, Diana, Schaul, Tom

Aug-26-2021, arXiv.org Artificial Intelligence

Exploration remains a central challenge for reinforcement learning (RL). Virtually all existing methods share the feature of a monolithic behaviour policy that changes only gradually (at best). In contrast, the exploratory behaviours of animals and humans exhibit a rich diversity, including forms of switching between modes. This paper presents an initial study of mode-switching, non-monolithic exploration for RL. We investigate different modes to switch between, at what timescales it makes sense to switch, and what signals make for good switching triggers. We also propose practical algorithmic components that make the switching mechanism adaptive and robust, which enables flexibility without an accompanying hyper-parameter-tuning burden. Finally, we report a promising and detailed analysis on Atari, using two-mode exploration and switching at sub-episodic timescales.
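
As a rough illustration of what two-mode exploration with sub-episodic switching can look like, the Python sketch below alternates between an "explore" and an "exploit" mode within a single episode. It is not the paper's algorithm. The fixed per-step trigger probability `start_prob`, the fixed explore duration `explore_len`, the uniform-random explore mode, the greedy exploit mode, and both function names are assumptions made only for this example.

```python
import random


def mode_schedule(num_steps, start_prob, explore_len, seed=0):
    """Return one mode label ("explore" or "exploit") per step of an episode.

    Hypothetical trigger: while exploiting, an explore period starts with
    probability `start_prob` at each step and lasts `explore_len` steps.
    """
    rng = random.Random(seed)
    modes = []
    remaining = 0  # steps left in the current explore period
    for _ in range(num_steps):
        if remaining == 0 and rng.random() < start_prob:
            remaining = explore_len
        if remaining > 0:
            modes.append("explore")
            remaining -= 1
        else:
            modes.append("exploit")
    return modes


def select_action(mode, q_values, rng):
    """Pick an action index given the current mode.

    Assumed modes: uniform random when exploring, greedy when exploiting.
    """
    if mode == "explore":
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

A schedule for a 1000-step episode could be obtained with `mode_schedule(1000, start_prob=0.01, explore_len=10)`. Replacing the random trigger with a state-dependent signal would change only the condition that starts an explore period.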

exploration, explore mode, xu-intra

Country:
- North America
  - Puerto Rico (0.04)
  - United States > New York
    - New York County > New York City (0.04)
- Europe > United Kingdom
  - England
    - Greater London > London (0.04)
    - Cambridgeshire > Cambridge (0.04)
- Asia > Japan
  - Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Genre:
- Research Report (0.50)

Industry:
- Leisure & Entertainment > Games (0.93)
- Health & Medicine > Therapeutic Area
  - Neurology (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)