Algorithms in Reinforcement Learning
Q-learning is a simple reinforcement learning algorithm for learning the optimal action in an unknown environment. It is model-free: without a model of the environment, it can learn actions that are optimal over the long term. It involves two policies, called the target policy and the behavior policy. Tabular methods represent policies and value functions exactly, as entries in tables. In Q-learning, to find the optimal action-value function, the behavior policy can be obtained using policy iteration.
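As a rough illustration of the ideas above, here is a minimal sketch of tabular Q-learning. It assumes a hypothetical five-state chain environment in which the agent moves left or right and is rewarded only on reaching the last state. The environment, the epsilon-greedy behavior policy, the function names, and the hyperparameters (alpha, gamma, epsilon) are all assumptions made for illustration, not details taken from the text. The behavior policy selects the actions during learning, while the update bootstraps from the greedy (target) policy through the max over next-state values.

```python
import random

N_STATES = 5          # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): reward 1.0 only on reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(q, state, epsilon, rng):
    """Behavior policy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    # Break ties at random so the untrained agent does not get stuck.
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table (one row per state, one column per action) without a model."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Target policy is greedy: bootstrap from the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Read the learned (target) policy off the table for non-terminal states."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(q_learning())` returns one action per non-terminal state. Because the table holds a separate entry for every state-action pair, this approach is only practical when the state and action spaces are small.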
Nov-11-2020, 08:43:08 GMT