Introduction to Various Reinforcement Learning Algorithms. Part II (TRPO, PPO)

@machinelearnbot 

Advantage is a term used in many advanced RL algorithms, such as A3C, NAF, and the algorithms I am going to discuss here (I may write a separate blog post on the first two). Intuitively, the advantage measures how good an action is compared to the average action in a specific state. But why do we need advantage? I will use an example posted in this forum to illustrate the idea. Have you ever played a game called "Catch"? In the game, fruit drops down from the top of the screen.
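To make "compared to the average action" concrete, here is a minimal sketch using the standard definition A(s, a) = Q(s, a) - V(s), where V(s) is the policy-weighted average of the Q-values in that state. The three actions (left, stay, right), the Q-values, and the policy probabilities are made-up numbers for illustration; they do not come from the game or from the forum example.

```python
def advantages(q_values, policy):
    """Return A(s, a) = Q(s, a) - V(s) for every action in one state.

    V(s) is the expected Q-value under the policy:
    V(s) = sum over a of pi(a|s) * Q(s, a).
    """
    v = sum(p * q for p, q in zip(policy, q_values))
    return [q - v for q in q_values]


# Hypothetical values for a single state, actions = (left, stay, right).
q_values = [1.0, 2.0, 6.0]   # assumed Q(s, a) estimates
policy = [0.25, 0.5, 0.25]   # assumed pi(a|s), sums to 1

adv = advantages(q_values, policy)
print(adv)  # [-1.75, -0.75, 3.25]
```

A positive advantage means the action is better than what the policy achieves on average in that state, and a negative one means it is worse. By construction, the advantages weighted by the policy probabilities sum to zero.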
