SwitchCIT: Switching for Continual Instruction Tuning of Large Language Models
Xinbo Wu, Max Hartman, Vidhata Arjun Jayaraman, Lav R. Varshney
Large language models (LLMs) have demonstrated remarkable capabilities across numerous domains, as highlighted by OpenAI (2023) and Bubeck et al. (2023). However, while LLMs pre-trained on extensive language data excel at general language understanding, they may not be optimized for every specific task prompted by instructions. There is therefore a need for continual instruction learning to adapt LLMs to evolving tasks and domains. Indeed, continual instruction learning is essential for LLMs such as GPT (Radford et al., 2019) to maintain their effectiveness and relevance across a wide range of tasks and domains. Such models are trained on vast amounts of text data and fine-tuned for specific applications, often by learning tasks sequentially (Luo et al., 2023), i.e., learning on the dataset for one task all at once before moving on to the next task. The challenge lies in their ability to continually learn and adapt as they encounter new tasks and information.
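The sequential, task-by-task fine-tuning regime described above can be illustrated with a minimal sketch. The code below assumes a generic PyTorch-style language model whose forward pass accepts `input_ids` and `labels` and returns an object with a `.loss` attribute; the model, datasets, and hyperparameters are hypothetical placeholders for illustration, not the paper's actual training setup.

```python
import torch
from torch.utils.data import DataLoader

def finetune_on_task(model, task_dataset, epochs=1, lr=2e-5, device="cpu"):
    """Fine-tune the model on a single task's instruction data, all at once."""
    model.to(device).train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loader = DataLoader(task_dataset, batch_size=8, shuffle=True)
    for _ in range(epochs):
        for batch in loader:
            input_ids = batch["input_ids"].to(device)
            labels = batch["labels"].to(device)
            loss = model(input_ids=input_ids, labels=labels).loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model

def continual_instruction_tuning(model, task_datasets):
    """Sequential regime: each task is learned fully before the next begins,
    which is where forgetting of earlier tasks can arise."""
    for task_dataset in task_datasets:
        model = finetune_on_task(model, task_dataset)
    return model
```

Because each call to `finetune_on_task` optimizes only the current task's objective, nothing in this loop preserves performance on earlier tasks; that gap is the motivation for continual instruction tuning methods.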
arXiv.org Artificial Intelligence
Jul-16-2024