On the Role of Attention in Prompt-tuning
Samet Oymak, Ankit Singh Rawat, Mahdi Soltanolkotabi, Christos Thrampoulidis
–arXiv.org Artificial Intelligence
Prompt-tuning is an emerging strategy to adapt large language models (LLMs) to downstream tasks by learning a (soft-)prompt parameter from data. Despite its success in LLMs, there is limited theoretical understanding of the power of prompt-tuning and of the role of the attention mechanism in prompting. In this work, we explore prompt-tuning for one-layer attention architectures and study contextual mixture-models where each input token belongs to a context-relevant or -irrelevant set. We isolate the role of prompt-tuning through a self-contained prompt-attention model. Our contributions are as follows: (1) We show that softmax-prompt-attention is provably more expressive than softmax-self-attention and linear-prompt-attention under our contextual data model.

Recently, one of the key techniques that has helped pave the way for deploying transformers to ever more application areas is their ability to adapt to multiple unseen tasks by conditioning their predictions on their inputs, a technique known as prompt-tuning (Lester et al., 2021; Li & Liang, 2021). Concretely, prompt-tuning provides a more efficient (cheaper/faster) alternative to fine-tuning the entire weights of the transformer: it instead trains a (much smaller) set of so-called prompt parameters that are appended to the input and can be thought of as an input interface. In fact, several recent works have demonstrated experimentally that prompt-tuning is not only more efficient, but often competitive with fine-tuning in terms of accuracy (Lester et al., 2021; Liu et al., 2023). However, there is currently limited formal justification for such observations. This motivates the first question of this paper: How does prompt-tuning compare to fine-tuning in terms of …
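The mechanism described above, a small set of trainable prompt vectors prepended to the input of a frozen model, is easy to make concrete. Below is a minimal PyTorch sketch, assuming a toy `OneLayerAttention` backbone with average pooling and a scalar head; these design choices are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class OneLayerAttention(nn.Module):
    """Toy one-layer softmax-attention model with a scalar prediction head."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)   # key/query weights
        self.head = nn.Linear(dim, 1, bias=False)  # scalar prediction head

    def forward(self, X):                          # X: (batch, tokens, dim)
        scores = X @ self.W(X).transpose(1, 2)     # (batch, tokens, tokens)
        attn = scores.softmax(dim=-1)              # softmax-self-attention
        pooled = (attn @ X).mean(dim=1)            # average attended features
        return self.head(pooled).squeeze(-1)       # (batch,)

class SoftPromptWrapper(nn.Module):
    """Prepends k trainable prompt vectors to the input; the backbone is frozen."""
    def __init__(self, model, dim, k=4):
        super().__init__()
        self.model = model
        self.prompt = nn.Parameter(0.02 * torch.randn(k, dim))
        for p in self.model.parameters():          # freeze all backbone weights
            p.requires_grad_(False)

    def forward(self, X):
        P = self.prompt.unsqueeze(0).expand(X.size(0), -1, -1)
        return self.model(torch.cat([P, X], dim=1))  # prompt acts as an input interface

# Prompt-tuning: optimize only the prompt parameters on a synthetic batch.
dim = 16
model = SoftPromptWrapper(OneLayerAttention(dim), dim, k=4)
opt = torch.optim.Adam([model.prompt], lr=1e-2)
X, y = torch.randn(8, 10, dim), torch.randn(8)
loss = nn.functional.mse_loss(model(X), y)
loss.backward()
opt.step()
```

Note that the optimizer receives only `model.prompt`, so gradient updates touch far fewer parameters than fine-tuning the whole backbone, which is the efficiency argument made above.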
Jun-6-2023