Reinforcement Learning in POMDPs With Memoryless Options and Option-Observation Initiation Sets
