Finding good policies in average-reward Markov Decision Processes without prior knowledge

Neural Information Processing Systems 

Reinforcement learning (RL) is a paradigm in which an agent interacts with its environment, modeled as a Markov Decision Process (MDP), by taking actions and observing rewards.
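As a rough illustration of this interaction loop in the average-reward setting, the sketch below simulates an agent on a small tabular MDP and reports the empirical average reward of a fixed policy. The two-state MDP, its transition probabilities and rewards, and the names `step` and `average_reward` are all hypothetical, chosen only for illustration; they are not taken from the paper.

```python
import random

# Hypothetical two-state, two-action MDP, used only for illustration.
# P[s][a] lists (next_state, probability) pairs; R[s][a] is the reward.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(0, 0.2), (1, 0.8)]},
    1: {0: [(0, 0.7), (1, 0.3)], 1: [(0, 0.1), (1, 0.9)]},
}
R = {0: {0: 0.0, 1: 0.1}, 1: {0: 0.5, 1: 1.0}}


def step(state, action, rng):
    """Sample the next state and return it with the reward for (state, action)."""
    next_states, probs = zip(*P[state][action])
    next_state = rng.choices(next_states, weights=probs, k=1)[0]
    return next_state, R[state][action]


def average_reward(policy, horizon, seed=0):
    """Run `policy` for `horizon` steps and return the mean reward per step."""
    rng = random.Random(seed)
    state, total = 0, 0.0
    for _ in range(horizon):
        action = policy(state)          # the agent takes an action
        state, reward = step(state, action, rng)  # and observes a reward
        total += reward
    return total / horizon
```

For example, `average_reward(lambda s: 1, horizon=10_000)` estimates the long-run reward per step of the policy that always plays action 1 in this toy MDP.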
