Disentangling Latent Shifts of In-Context Learning Through Self-Training
In-context learning (ICL) has become essential in natural language processing, particularly with autoregressive large language models capable of learning from demonstrations provided within the prompt. However, ICL faces challenges with stability and long contexts, especially as the number of demonstrations grows, leading to poor generalization and inefficient inference. To address these issues, the paper disentangles the latent shift induced by the demonstrations from the latent shift of the query through self-training: a teacher model generates pseudo-labels on which a student model is trained. The student model exhibits weak-to-strong generalization, progressively refining its predictions over time.

In-context learning (ICL) (Brown et al., 2020) has emerged as a significant machine learning paradigm, particularly in natural language processing (NLP) applications that use large language models (LLMs). Unlike traditional supervised methods that rely on training over multiple epochs with large datasets, ICL leverages the ability of autoregressive LLMs to learn from context, with the demonstrations and the query combined in a single prompt. This enables models to rapidly adapt to new tasks or varying input patterns without additional fine-tuning. Moreover, ICL is effective in low-resource setups, using zero-shot and few-shot learning to perform tasks with minimal or no supervision (Dong et al., 2024a). Despite these strengths, ICL faces several critical challenges.
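To make the ICL setup concrete, here is a minimal sketch of how demonstrations and a query are combined into a single prompt for a frozen autoregressive LLM. The sentiment-classification task, the labels, and the `build_icl_prompt` helper are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of few-shot in-context learning (ICL): demonstrations and the
# query are concatenated into a single prompt, and a frozen LLM predicts the
# answer without any parameter updates. Task and examples are placeholders.

def build_icl_prompt(demonstrations, query,
                     instruction="Classify the sentiment as positive or negative."):
    """Combine an instruction, labeled demonstrations, and the query into one prompt."""
    lines = [instruction, ""]
    for text, label in demonstrations:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n".join(lines)

demos = [
    ("A delightful, heartfelt film.", "positive"),
    ("Two hours I will never get back.", "negative"),
]
prompt = build_icl_prompt(demos, "The plot was thin but the acting saved it.")
print(prompt)
# The prompt would then be passed to an autoregressive LLM, e.g.
#   completion = llm.generate(prompt)   # hypothetical API call
```

The self-training idea can likewise be sketched generically: a teacher (e.g., the LLM conditioned on the demonstrations) pseudo-labels unlabeled queries, and a student that sees only the query is trained to reproduce those labels, so the knowledge carried by the demonstrations no longer has to sit in the prompt. The loop below is an illustrative sketch under these assumptions rather than the paper's implementation; `teacher_predict`, `student`, and the optimizer are hypothetical.

```python
# Illustrative self-training loop (not the paper's exact method): a teacher,
# i.e. the LLM conditioned on demonstrations, pseudo-labels unlabeled queries,
# and a demonstration-free student is trained to reproduce those labels.
import torch
import torch.nn.functional as F

def self_train(student, teacher_predict, unlabeled_queries, optimizer, epochs=3):
    """Train `student` on pseudo-labels produced by `teacher_predict`.

    teacher_predict(query) -> int class index (hypothetical helper wrapping
    the demonstration-conditioned LLM); student(query) -> 1-D logits tensor.
    """
    for _ in range(epochs):
        for query in unlabeled_queries:
            pseudo_label = torch.tensor([teacher_predict(query)])
            logits = student(query).unsqueeze(0)          # shape: (1, num_classes)
            loss = F.cross_entropy(logits, pseudo_label)  # fit the pseudo-label
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student
```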
arXiv.org Artificial Intelligence
Oct-2-2024