Dealing with Sparse Rewards Using Graph Neural Networks

Gerasyov, Matvey, Makarov, Ilya

arXiv.org Artificial Intelligence 

Reinforcement learning is a machine learning paradigm where an artificial agent learns the optimal behavior through interactions with a dynamic environment. Goals and purposes are explained to the agent via a scalar reward signal it receives after each interaction. Throughout the training process, the agent infers the behavior that maximizes cumulative reward, also called the return. To succeed in this task, the agent needs to explore the environment to understand which states and actions yield high rewards. On the other hand, the agent also has to exploit the rewards it has already received to adapt its behavior. This problem is known as the exploration and exploitation trade-off. This work was supported in part on Section 2 by the Strategic Project "Digital Business" within the framework of the Strategic Academic Leadership Program "Priority 2030" at the National University of Science and Technology (NUST) MISiS, in part by the Basic Research Program at the National Research University Higher School of Economics (HSE University), and in part by the Computational Resources of HPC Facilities at HSE University.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found