Learning in Observable POMDPs, without Computationally Intractable Oracles

Neural Information Processing Systems 

Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement. Specifically for learning near-optimal policies in Partially Observable Markov Decision Processes (POMDPs), existing algorithms either need to make strong assumptions about the model dynamics (e.g.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found