Disentangling Latent Shifts of In-Context Learning Through Self-Training
In-context learning (ICL) has become essential in natural language processing, particularly with autoregressive large language models capable of learning from demonstrations provided within the prompt. However, ICL faces challenges with stability and long contexts, especially as the number of demonstrations grows, leading to poor generalization and inefficient inference. To address these issues, the paper disentangles the latent shift induced by the demonstrations from the latent shift of the query through self-training: a teacher model generates pseudo-labels on which a student model is trained. The student model exhibits weak-to-strong generalization, progressively refining its predictions over time.

In-context learning (ICL) (Brown et al., 2020) has emerged as a significant machine learning paradigm, particularly in natural language processing (NLP) applications that use large language models (LLMs). Unlike traditional supervised methods that rely on training over multiple epochs with large datasets, ICL leverages the ability of autoregressive LLMs to learn from context, with the demonstrations and the query combined in a single prompt. This enables models to rapidly adapt to new tasks or varying input patterns without additional fine-tuning. Moreover, ICL is effective in low-resource setups, using zero-shot and few-shot learning to perform tasks with minimal or no supervision (Dong et al., 2024a). Despite these strengths, ICL faces several critical challenges.
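To make the ICL setup concrete, here is a minimal sketch of how demonstrations and a query are combined into a single prompt for a frozen autoregressive LLM. The sentiment-classification task, the labels, and the `build_icl_prompt` helper are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of few-shot in-context learning (ICL): demonstrations and the
# query are concatenated into a single prompt, and a frozen LLM predicts the
# answer without any parameter updates. Task and examples are placeholders.

def build_icl_prompt(demonstrations, query,
                     instruction="Classify the sentiment as positive or negative."):
    """Combine an instruction, labeled demonstrations, and the query into one prompt."""
    lines = [instruction, ""]
    for text, label in demonstrations:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n".join(lines)

demos = [
    ("A delightful, heartfelt film.", "positive"),
    ("Two hours I will never get back.", "negative"),
]
prompt = build_icl_prompt(demos, "The plot was thin but the acting saved it.")
print(prompt)
# The prompt would then be passed to an autoregressive LLM, e.g.
#   completion = llm.generate(prompt)   # hypothetical API call
```

The self-training idea can likewise be sketched generically: a teacher (e.g., the LLM conditioned on the demonstrations) pseudo-labels unlabeled queries, and a student that sees only the query is trained to reproduce those labels, so the knowledge carried by the demonstrations no longer has to sit in the prompt. The loop below is an illustrative sketch under these assumptions rather than the paper's implementation; `teacher_predict`, `student`, and the optimizer are hypothetical.

```python
# Illustrative self-training loop (not the paper's exact method): a teacher,
# i.e. the LLM conditioned on demonstrations, pseudo-labels unlabeled queries,
# and a demonstration-free student is trained to reproduce those labels.
import torch
import torch.nn.functional as F

def self_train(student, teacher_predict, unlabeled_queries, optimizer, epochs=3):
    """Train `student` on pseudo-labels produced by `teacher_predict`.

    teacher_predict(query) -> int class index (hypothetical helper wrapping
    the demonstration-conditioned LLM); student(query) -> 1-D logits tensor.
    """
    for _ in range(epochs):
        for query in unlabeled_queries:
            pseudo_label = torch.tensor([teacher_predict(query)])
            logits = student(query).unsqueeze(0)          # shape: (1, num_classes)
            loss = F.cross_entropy(logits, pseudo_label)  # fit the pseudo-label
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student
```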
arXiv.org Artificial Intelligence
Oct-2-2024