Sub-goal Distillation: A Method to Improve Small Language Agents
Maryam Hashemzadeh, Elias Stengel-Eskin, Sarath Chandar, Marc-Alexandre Côté
–arXiv.org Artificial Intelligence
While Large Language Models (LLMs) have demonstrated significant promise as agents in interactive tasks, their substantial computational requirements and restricted number of calls constrain their practical utility, especially in long-horizon interactive tasks such as decision-making, or in scenarios involving continuous ongoing tasks. To address these constraints, we propose a method for transferring the performance of an LLM with billions of parameters to a much smaller language model (770M parameters). Our approach involves constructing a hierarchical agent comprising a planning module, which learns through Knowledge Distillation from an LLM to generate sub-goals, and an execution module, which learns to accomplish these sub-goals using elementary actions. We then use the resulting sub-goal-annotated data to fine-tune both the planning and execution modules. Importantly, neither module relies on real-time access to an LLM during inference, reducing the overall cost of LLM interactions to a fixed, one-time annotation cost. In ScienceWorld, a challenging multi-task interactive text environment, our method surpasses standard imitation learning based solely on elementary actions by 16.7% (absolute). Our analysis highlights the efficiency of our approach compared to other LLM-based methods.

Recently, Large Language Models (LLMs) have found applications in various fields, including multi-task learning, decision making, question answering, document summarization, language translation, sentence completion, and search assistance. The promising advantage of LLMs is attributed to their training on extensive text datasets, which yields impressive capabilities. This prior knowledge can be leveraged for action planning to solve tasks in robotics and reinforcement learning (Huang et al., 2022b; Brohan et al., 2023; Liang et al., 2023). However, the extreme size of LLMs makes them computationally unaffordable for many applications.
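The hierarchical control loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the planner and executor stand in for the two fine-tuned small language models, and are replaced here by toy rule tables; all task and action names are hypothetical.

```python
# Toy sketch of a hierarchical agent: a planner proposes sub-goals,
# an executor expands each sub-goal into elementary actions.

def planner(task, observation):
    """Propose the next sub-goal (would be the fine-tuned 770M planning LM)."""
    if "melt" in task and "ice" not in observation:
        return "find ice"
    return "heat ice"

def executor(sub_goal):
    """Map a sub-goal to elementary actions (the fine-tuned execution LM)."""
    actions = {
        "find ice": ["go to kitchen", "open freezer", "take ice"],
        "heat ice": ["go to stove", "activate stove", "put ice on stove"],
    }
    return actions[sub_goal]

def run_episode(task, observation, num_sub_goals=2):
    """Alternate planning and execution; no LLM call happens at inference."""
    trajectory = []
    for _ in range(num_sub_goals):
        sub_goal = planner(task, observation)
        for action in executor(sub_goal):
            trajectory.append((sub_goal, action))
        # Crude stand-in for the environment updating the observation.
        observation += " " + sub_goal.split()[-1]
    return trajectory

traj = run_episode("melt the ice", "you are in a hallway")
```

The key design point mirrored here is that both modules run locally at inference time; the LLM is only needed once, offline, to annotate expert trajectories with sub-goals.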
Consequently, there is increasing demand for approaches that are less computationally intensive while still capitalizing on the knowledge embedded in LLMs. One prevalent technique is Knowledge Distillation (KD) (Buciluǎ et al., 2006; Hinton et al., 2015), in which a smaller model is trained with guidance from a larger model. Through this approach, we can leverage the knowledge in an LLM to train a more compact model with far fewer parameters. We employ Knowledge Distillation from an LLM to train the planning module.

Figure 1: Example of annotating an expert trajectory with sub-goals for a particular variation of task 1-4 (change-the-state-of-matter-of).
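For reference, the classic soft-target distillation loss from the cited Hinton et al. (2015) can be written in a few lines. This is a generic sketch of that loss, not the paper's specific sub-goal annotation pipeline; the temperature value is an arbitrary example.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target KD loss (Hinton et al., 2015): cross-entropy between
    the softened teacher and student distributions, scaled by T^2 so
    gradient magnitudes stay comparable to the hard-label loss."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    cross_entropy = -sum(t * math.log(s)
                         for t, s in zip(teacher_probs, student_probs))
    return temperature ** 2 * cross_entropy
```

The loss is minimized when the student matches the teacher's softened distribution, so a student that disagrees with the teacher incurs a strictly larger value. In this paper's setting, distillation instead takes the form of fine-tuning the small planner on LLM-generated sub-goal annotations, but the underlying idea of transferring a large model's knowledge into a small one is the same.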
May-4-2024