Runtime Verification of Learning Properties for Reinforcement Learning Algorithms

Mannucci, Tommaso; Filho, Julio de Oliveira

arXiv.org Artificial Intelligence 

Reinforcement learning (RL) [16] is a bio-inspired approach to machine learning that formalizes the notions of "trial-and-error" and "learning by doing". RL enables systems to learn during operation through sequential interactions with the environment. During the learning phase, RL algorithms reinforce decisions that led to good results in the past and discourage detrimental ones. This simple concept underlies some stunning results in robotic automation [14], natural language processing [7], and computer game playing, including the Atari [10] and StarCraft [19] video games and the ancient board games of chess, shogi, and Go [13]. Because RL algorithms learn at runtime and through interaction, there is an increasing demand for guarantees about their learning; for example, that learning will conclude within a certain amount of time or number of interactions when it takes place in an operational environment.