Feudal Reinforcement Learning by Reading Manuals
Kai Wang, Zhonghao Wang, Mo Yu, Humphrey Shi
arXiv.org Artificial Intelligence
Reading to act is a prevalent but challenging task that requires the ability to reason from a concise instruction. However, previous work suffers from a semantic mismatch between low-level actions and high-level language descriptions, and requires a human-designed curriculum to work properly. In this paper, we present a Feudal Reinforcement Learning (FRL) model consisting of a manager agent and a worker agent. The manager agent is a multi-hop plan generator that deals with high-level abstract information and generates a series of sub-goals in a backward manner. The worker agent deals with low-level perceptions and actions to achieve the sub-goals one by one. Compared with previous work, our FRL model effectively alleviates the mismatch between text-level inference and low-level perceptions and actions; it generalizes to various forms of environments, instructions, and manuals; and its multi-hop plan generator significantly boosts performance on challenging tasks where multi-step reasoning from the texts is critical to resolving the instructed goals. We show that our approach achieves competitive performance on two challenging tasks, Read to Fight Monsters (RTFM) and Messenger, without human-designed curriculum learning.

Recently, there has been increasing interest in building reinforcement learning (RL) agents that interact with humans via natural language, for example by following natural language instructions and completing goals specified in natural language. The success of these studies will improve the user experience in a wide range of real-world applications, such as visual language navigation (Anderson et al., 2018; Wang et al., 2019b), interactive games (Gray et al., 2019), robot control (Tellex et al., 2020), and goal-oriented dialog systems and other personal assistant applications (Dhingra et al., 2017). To generalize to real-world use cases, research on RL with language instructions must cope with various kinds of complexity.
One critical demand of these use cases is that humans tend to give concise instructions, which specify the goals they hope to achieve, instead of providing complete information for the intermediate steps.
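The manager/worker split described in the abstract can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the paper's manager and worker are learned agents, whereas here the manager is a hand-coded backward-chaining lookup over a hypothetical `manual` dictionary, and the worker is a greedy mover on an obstacle-free grid. The names `manager_plan`, `worker_reach`, and `run_episode` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]


def manager_plan(goal: str, manual: Dict[str, str]) -> List[str]:
    """Backward multi-hop planning over a toy manual.

    `manual` maps a goal to the single prerequisite it needs (an assumed,
    simplified stand-in for text in a real manual). Starting from the final
    goal, follow prerequisites hop by hop, then reverse the chain so that
    prerequisites come first in execution order.
    """
    chain = [goal]
    seen = {goal}
    while chain[-1] in manual:
        prerequisite = manual[chain[-1]]
        if prerequisite in seen:  # guard against cyclic manuals
            break
        chain.append(prerequisite)
        seen.add(prerequisite)
    chain.reverse()
    return chain


def worker_reach(start: Position, target: Position) -> List[str]:
    """Greedy low-level policy: primitive moves toward `target` on an open grid."""
    x, y = start
    tx, ty = target
    actions: List[str] = []
    while x != tx:
        actions.append("right" if tx > x else "left")
        x += 1 if tx > x else -1
    while y != ty:
        actions.append("down" if ty > y else "up")
        y += 1 if ty > y else -1
    return actions


def run_episode(
    goal: str,
    manual: Dict[str, str],
    locations: Dict[str, Position],
    start: Position,
) -> List[Tuple[str, List[str]]]:
    """The manager emits sub-goals; the worker achieves them one by one."""
    position = start
    trace = []
    for sub_goal in manager_plan(goal, manual):
        trace.append((sub_goal, worker_reach(position, locations[sub_goal])))
        position = locations[sub_goal]
    return trace


if __name__ == "__main__":
    manual = {"defeat_monster": "pick_up_sword"}  # hypothetical manual entry
    locations = {"pick_up_sword": (2, 0), "defeat_monster": (2, 2)}
    for sub_goal, actions in run_episode("defeat_monster", manual, locations, (0, 0)):
        print(sub_goal, actions)
```

The point of the sketch is the division of labour: the planner only touches symbolic goals drawn from the text, while the mover only touches grid coordinates and primitive actions, so neither has to bridge the gap between language and low-level control on its own.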
Oct-12-2021