Self-Adapting Language Models

Neural Information Processing Systems 

Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks, knowledge, or examples. We introduce $\textbf{Se}$lf-$\textbf{A}$dapting $\textbf{L}$LMs (SEAL), a framework that enables LLMs to self-adapt by generating their own finetuning data and update directives. Given a new input, the model produces a $\textit{self-edit}$ --- a generation that may restructure the information in different ways, specify optimization hyperparameters, or invoke tools for data augmentation and gradient-based updates.