Sample-Efficient Reinforcement Learning of Undercomplete POMDPs

Neural Information Processing Systems 

In many sequential decision-making settings, the agent lacks complete information about the underlying state of the system, a phenomenon known as partial observability.
