Explainability in Deep Reinforcement Learning
Heuillet, Alexandre, Couthouis, Fabien, Díaz-Rodríguez, Natalia
arXiv.org Artificial Intelligence
During the past decade, Artificial Intelligence (AI), and by extension Machine Learning (ML), have seen an unprecedented rise in both industry and research. The steady improvement of computer hardware, combined with the need to process ever larger amounts of data, has cast these once-underestimated techniques in a new light. Reinforcement Learning (RL) focuses on learning how to map situations to actions in order to maximize a numerical reward signal [102]. The learner is not told which actions to take, but instead must discover which actions are the most rewarding by trying them. Reinforcement learning addresses the problem of how agents should learn a policy that takes actions to maximize the cumulative reward through interaction with the environment [31]. Recent progress in Deep Learning (DL) for learning feature representations has significantly impacted RL, and the combination of both methods (known as deep RL) has led to remarkable results in many areas. Typically, RL is used to solve optimisation problems when the system has a very large number of states and a complex stochastic structure. Notable examples include training agents to play Atari games based on raw pixels [75, 76], board games [96, 97], complex real-world robotics problems such as manipulation [8] or grasping [54], and other real-world applications such as resource management in computer clusters [72], network traffic signal control [9], chemical reaction optimization [117] or recommendation systems [116].
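As a rough illustration of the trial-and-error loop described above, here is a minimal sketch of tabular Q-learning on a toy five-state chain. It is not taken from the paper: the environment, the function names (train_q_learning, greedy_policy) and all hyperparameter values are assumptions chosen for illustration, and deep RL would replace the table with a neural network.

```python
import random


def train_q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical chain: states 0..n_states-1.

    Actions: 0 = move left, 1 = move right. Reaching the last state
    gives reward 1 and ends the episode; every other step gives 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # step cap so an episode always terminates
            # Epsilon-greedy: mostly exploit, sometimes try a random action.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            # Environment step (deterministic in this toy example).
            if action == 0:
                next_state = max(0, state - 1)
            else:
                next_state = min(goal, state + 1)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            future = 0.0 if next_state == goal else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])

            state = next_state
            if state == goal:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    q_table = train_q_learning()
    print(greedy_policy(q_table))
```

The agent is never told that moving right is correct; it discovers this only through the reward it receives after trying actions, which is the defining property of RL noted in the abstract.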
Aug-20-2020
- Country:
  - North America > United States (0.14)
  - Oceania > Australia
    - New South Wales > Sydney (0.04)
  - Europe
    - France (0.04)
    - Germany > Hesse
      - Darmstadt Region > Darmstadt (0.04)
  - Asia > Middle East
    - Jordan (0.04)
- Genre:
  - Research Report (1.00)
- Industry:
  - Leisure & Entertainment > Games > Computer Games (0.86)
- Technology: