Effectively Prompting Small-sized Language Models for Cross-lingual Tasks via Winning Tickets
Current soft prompt methods yield limited performance when applied to small-sized models (fewer than a billion parameters). Deep prompt-tuning, which prepends parameters at each layer for enhanced efficacy, offers one way to prompt small-sized models, but it requires a carefully designed implementation. In this paper, we introduce the Lottery Ticket Prompt-learning (LTP) framework, which integrates winning tickets with soft prompts. LTP offers a simpler implementation and requires only a one-time execution. We demonstrate LTP on cross-lingual tasks, where prior works rely on external tools such as human-designed multilingual templates and bilingual dictionaries, which may not be available in a low-resource regime. Specifically, we select the subset of parameters that changed the most during fine-tuning with the masked language modeling (MLM) objective. We then prepend soft prompts to the original pre-trained language model and update only the selected parameters together with the prompt-related parameters when adapting to downstream tasks. We verify the effectiveness of our LTP framework on cross-lingual tasks, specifically targeting low-resource languages. Our approach outperforms the baselines while updating only 20% of the original parameters.
arXiv.org Artificial Intelligence
Apr-1-2024
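
The abstract outlines the two LTP steps: select a "winning ticket" of the most-changed parameters after MLM fine-tuning, then train only that subset plus prepended soft prompts on the downstream task. Below is a minimal PyTorch sketch of that idea, assuming a Hugging Face-style masked language model; the helper names (select_winning_ticket, SoftPromptModel, mask_gradients) and the top-k selection details are illustrative assumptions, not the paper's released implementation.

```python
# Sketch of the Lottery Ticket Prompt-learning (LTP) idea described in the abstract.
# Assumes a Hugging Face-style masked LM (e.g. XLM-R); names are hypothetical.
import torch
import torch.nn as nn

def select_winning_ticket(pretrained, finetuned, keep_ratio=0.2):
    """For each weight tensor, keep the entries that changed most during MLM fine-tuning."""
    masks = {}
    for (name, p0), (_, p1) in zip(pretrained.named_parameters(),
                                   finetuned.named_parameters()):
        delta = (p1.detach() - p0.detach()).abs().flatten()
        k = max(1, int(keep_ratio * delta.numel()))
        threshold = torch.topk(delta, k).values.min()
        masks[name] = (delta >= threshold).view_as(p0)
    return masks

class SoftPromptModel(nn.Module):
    """Prepends trainable soft-prompt embeddings to the inputs of the original model."""
    def __init__(self, model, prompt_len=16):
        super().__init__()
        self.model = model
        hidden = model.config.hidden_size
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)

    def forward(self, input_ids, attention_mask):
        embeds = self.model.get_input_embeddings()(input_ids)
        batch = embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        embeds = torch.cat([prompt, embeds], dim=1)
        prompt_mask = torch.ones(batch, self.prompt.size(0),
                                 device=attention_mask.device)
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)
        return self.model(inputs_embeds=embeds, attention_mask=attention_mask)

def mask_gradients(wrapped, masks):
    """Zero gradients outside the winning ticket, so only the selected ~20% of
    original weights (plus the soft prompt) are updated during adaptation."""
    for name, p in wrapped.model.named_parameters():
        if name in masks and p.grad is not None:
            p.grad.mul_(masks[name].to(p.grad.dtype))
```

In use, `mask_gradients` would be called after `loss.backward()` and before `optimizer.step()` on each downstream training step, while the soft-prompt parameter is left unmasked and trained normally.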