Key Concepts of Modern Reinforcement Learning
As the Agent interacts with the Environment, it learns a policy. A policy is a "learned strategy" that governs the agents' behaviour in selecting an action at a particular time t of the Environment. A policy can be seen as a mapping from states of an Environment to the actions taken in those states. The goal of the reinforcement Agent is to maximize its long-term rewards as it interacts with the Environment in the feedback configuration. The response the Agent gets from each state-action cycle (where an Agent selects an action from a set of actions at each state of the Environment) is called the reward function. The reward function (or simply rewards) is a signal of the desirability of that state based on the action made by the Agent.
Mar-30-2020, 13:37:04 GMT
- Technology: