Understanding Prompt Tuning and In-Context Learning via Meta-Learning

Tim Genewein, Kevin Wenliang Li, Jordi Grau-Moya, Anian Ruoss, Laurent Orseau, Marcus Hutter

arXiv.org Machine Learning 

Perhaps the most impressive feature of today's frontier models is their ability to swiftly adapt their behavior to a wide range of contexts. Given relatively few tokens, whether from a user input, a system prompt, or a number of in-context examples, models often rapidly infer the task at hand and produce good continuations without any weight adaptation (in-context learning, Lampinen et al. [2024]). From a meta-learning perspective, rapid in-context adaptation is expected to arise: log loss minimization with a parametric sequential predictor (such as a neural network) over a distribution of stochastic data generators leads to a Bayesian predictor for the pretraining distribution [Ortega et al., 2019]. The hallmark of such a predictor (Bayes-optimality) is the most rapid in-context adaptation and the least cumulative prediction error on average. Accordingly, prompting, that is, conditioning the Bayesian predictor, can be used to data-efficiently adapt the pretrained model to a target task. An important question is: under which conditions is it possible to find a prompt such that the prompted pretrained predictor becomes (near-) Bayes-optimal on a target task? We refer to this as optimal prompting, which is possible in theory if the target task is one of the tasks covered by the meta-distribution. If this is not the case, then optimal prompting may not be possible even for an ideal predictor, and weight adaptation may be necessary.
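The abstract's central claim, that prompting a Bayesian predictor amounts to conditioning, and that in-context adaptation falls out of posterior updating, can be illustrated with a toy task distribution. The sketch below is an illustrative assumption on our part (a discrete mixture of Bernoulli "tasks", not the paper's transformer setup): the exact Bayes predictor for this mixture is what ideal log-loss meta-training would converge to, and feeding it a prompt shifts its posterior over tasks.

```python
def bayes_predictive(prior, biases, prompt):
    """Exact Bayesian next-token predictor for a mixture of Bernoulli
    'tasks' (coin biases). Returns P(next token = 1 | prompt).

    Ideal log-loss pretraining on sequences drawn from this mixture
    would recover exactly this predictor (illustrative toy setting)."""
    # Posterior over tasks given the prompt: this IS in-context adaptation.
    post = []
    for w, p in zip(prior, biases):
        lik = 1.0
        for x in prompt:
            lik *= p if x == 1 else 1.0 - p
        post.append(w * lik)
    z = sum(post)
    post = [w / z for w in post]
    # Posterior predictive: prompting = conditioning the predictor.
    return sum(w * p for w, p in zip(post, biases))

# Uniform prior over two tasks: a fair coin and a 0.9-biased coin.
prior, biases = [0.5, 0.5], [0.5, 0.9]
p0 = bayes_predictive(prior, biases, [])        # before prompting: 0.7
p1 = bayes_predictive(prior, biases, [1] * 10)  # after a prompt of ten 1s
```

After ten observed 1s the predictive probability moves close to 0.9: the prompt has adapted the predictor to the biased-coin task without any weight change. The same toy also shows the abstract's caveat: for a target task outside the meta-distribution (say bias 0.2), no prompt can push this predictor's output below 0.5, so optimal prompting is impossible and weight adaptation would be needed.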
