MindForge: Empowering Embodied Agents with Theory of Mind for Lifelong Collaborative Learning

Mircea Lică, Ojas Shirekar, Baptiste Colle, Chirag Raman

arXiv.org Artificial Intelligence 

Contemporary embodied agents, such as Voyager in Minecraft, have demonstrated promising capabilities in open-ended individual learning. However, when powered by open large language models (LLMs), these agents often struggle with rudimentary tasks, even when fine-tuned on domain-specific knowledge. MindForge equips embodied agents with a Theory of Mind, enabling them to reason about their own and others' mental states and empirically addressing two prevalent failure modes: false beliefs and faulty task executions.

The development of generally capable agents marks a significant shift in advancing artificial intelligence, transitioning from assimilating data to generating novel knowledge through embodied interactions with open-ended environments (Kolve et al., 2017; Savva et al., 2019; Puig et al., 2018; Shridhar et al., 2020). Classical approaches leveraging reinforcement learning (Schulman et al., 2017; Hafner et al., 2023) and imitation learning (Zare et al., 2024) often struggle with generalization and exploration, as agents tend to converge on repetitive behaviors in static environments (Cobbe et al., 2019). To address these limitations, researchers have sought to emulate human-like lifelong learning capabilities, developing systems that can continuously acquire, update, and transfer knowledge over extended periods (Parisi et al., 2019; Wang et al., 2023b). The advent of large language models (LLMs) has accelerated this pursuit, enabling the development of agents such as Voyager (Wang et al., 2023a) that can apply internet-scale knowledge to continuously explore, plan, and acquire new skills in partially observable, open-ended environments such as Minecraft. Despite their promise, we argue that state-of-the-art lifelong learning agents like Voyager face a crucial limitation: they learn in isolation, neglecting a fundamental aspect of human intelligence--the social context.
So central is the social context to our existence that the Social Intelligence Hypothesis posits that our cognitive capabilities evolved primarily to navigate the complexities of social life (Humphrey, 1976; Dunbar, 1998). This isolated learning becomes particularly problematic when coupled with these agents' reliance on closed LLMs like GPT-4. Wang et al. (2023a) note that "VOYAGER requires
