When Is Partially Observable Reinforcement Learning Not Scary?
Qinghua Liu, Alan Chung, Csaba Szepesvári, Chi Jin
A wide range of modern artificial intelligence challenges can be cast as Reinforcement Learning (RL) problems under partial observability, in which agents learn to make a sequence of decisions despite lacking complete information about the underlying state of the system. For example, in robotics the agent has to cope with noisy sensors, occlusions, and unknown dynamics (Akkaya et al., 2019), while in imperfect-information games the player makes only local observations (Vinyals et al., 2019; Brown and Sandholm, 2019). Further applications of partially observable RL include autonomous driving (Levinson et al., 2011), resource allocation (Bower and Gilbert, 2005), medical diagnostic systems (Hauskrecht and Fraser, 2000), recommendation (Li et al., 2010), and business management (De Brito and Van Der Laan, 2009). As such, learning and acting under partial observability has been an important topic in operations research, control, and machine learning.

Because of the non-Markovian nature of the observations, learning and planning in partially observable environments requires an agent to maintain memory and possibly reason about beliefs over the states, all while exploring to collect information about the environment. Partial observability can therefore significantly complicate learning and planning under uncertainty. Practical RL systems have nevertheless succeeded in a set of partially observable problems, including Poker (Brown and Sandholm, 2019), Starcraft (Vinyals et al., 2019), and certain robotic tasks (Cassandra et al., 1996).
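The belief-maintenance point above can be made concrete with a minimal sketch, not taken from the paper: the standard discrete Bayes-filter belief update a POMDP agent would run, under the strong assumption that the transition tensor `T` and observation matrix `O` are known. All names here (`belief_update`, `T`, `O`) are hypothetical illustrations; the learning setting studied in the paper must manage without a known model.

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Posterior belief over states after taking action a and observing o.

    b : (S,)      prior belief over states, sums to 1
    T : (S, A, S) assumed-known transition probabilities P(s' | s, a)
    O : (S, Obs)  assumed-known observation probabilities P(o | s')
    """
    predicted = b @ T[:, a, :]          # predict: P(s') = sum_s b(s) T(s, a, s')
    posterior = predicted * O[:, o]     # correct: reweight by likelihood P(o | s')
    return posterior / posterior.sum()  # normalize (assumes P(o) > 0)

# Tiny usage example on a hypothetical 2-state, 1-action, 2-observation model.
T = np.array([[[0.9, 0.1]],
              [[0.2, 0.8]]])           # shape (S, A, S)
O = np.array([[0.8, 0.2],
              [0.3, 0.7]])             # shape (S, Obs)
b = np.array([0.5, 0.5])               # uniform prior belief
print(belief_update(b, a=0, o=1, T=T, O=O))
```

Note the design point this illustrates: the belief, not the latest observation, is the sufficient statistic for acting, which is exactly why partially observable RL must track memory over whole histories rather than react Markovianly.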
May 24, 2022
- Genre:
- Research Report > New Finding (0.67)
- Industry:
- Health & Medicine > Diagnostic Medicine (0.54)
- Leisure & Entertainment > Games (0.88)