Composable Sparse Fine-Tuning for Cross-Lingual Transfer
Alan Ansell, Edoardo Maria Ponti, Anna Korhonen, Ivan Vulić
arXiv.org Artificial Intelligence
Fine-tuning the entire set of parameters of a large pretrained model has become the mainstream approach for transfer learning. To increase its efficiency and prevent catastrophic forgetting and interference, techniques like adapters and sparse fine-tuning have been developed. Adapters are modular, as they can be combined to adapt a model towards different facets of knowledge (e.g., dedicated language and/or task adapters). Sparse fine-tuning is expressive, as it controls the behavior of all model components. In this work, we introduce a new fine-tuning method with both these desirable properties. In particular, we learn sparse, real-valued masks based on a simple variant of the Lottery Ticket Hypothesis. Task-specific masks are obtained from annotated data in a source language, and language-specific masks from masked language modeling in a target language. Both these masks can then be composed with the pretrained model. Unlike adapter-based fine-tuning, this method neither increases the number of parameters at inference time nor alters the original model architecture. Most importantly, it outperforms adapters in zero-shot cross-lingual transfer by a large margin in a series of multilingual benchmarks, including Universal Dependencies, MasakhaNER, and AmericasNLI. Based on an in-depth analysis, we additionally find that sparsity is crucial to prevent both 1) interference between the fine-tunings to be composed and 2) overfitting. We release the code and models at https://github.com/cambridgeltl/composable-sft.
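The composition step described in the abstract can be illustrated with a minimal sketch. It is not the released implementation. It assumes that a sparse fine-tuning is stored as a difference vector with the same shape as the pretrained parameters, and that the sparse positions are chosen by keeping the k parameters that moved most during an ordinary fine-tuning run. The abstract does not spell out that selection rule, so it is one plausible reading of the Lottery Ticket variant. The function names (top_k_mask, sparse_difference, compose) and the toy parameter values are hypothetical, and any retraining restricted to the selected parameters is omitted.

```python
from typing import List, Sequence


def top_k_mask(pretrained: Sequence[float], fine_tuned: Sequence[float], k: int) -> List[bool]:
    """Mark the k parameters with the largest absolute change (assumed selection rule)."""
    deltas = [abs(f - p) for p, f in zip(pretrained, fine_tuned)]
    ranked = sorted(range(len(deltas)), key=lambda i: deltas[i], reverse=True)
    chosen = set(ranked[:k])
    return [i in chosen for i in range(len(deltas))]


def sparse_difference(pretrained: Sequence[float], fine_tuned: Sequence[float],
                      mask: Sequence[bool]) -> List[float]:
    """Keep the real-valued change at masked positions and zero everywhere else."""
    return [f - p if m else 0.0 for p, f, m in zip(pretrained, fine_tuned, mask)]


def compose(pretrained: Sequence[float], *differences: Sequence[float]) -> List[float]:
    """Add any number of sparse difference vectors to the pretrained parameters."""
    out = list(pretrained)
    for diff in differences:
        out = [o + d for o, d in zip(out, diff)]
    return out


# Toy values standing in for a flattened parameter vector (hypothetical).
pretrained = [0.5, -1.0, 2.0, 0.0]
task_tuned = [0.5, -0.25, 2.0, 1.0]   # after fine-tuning on source-language task data
lang_tuned = [1.5, -1.0, 2.25, 0.0]   # after masked language modeling in the target language

task_diff = sparse_difference(pretrained, task_tuned, top_k_mask(pretrained, task_tuned, 1))
lang_diff = sparse_difference(pretrained, lang_tuned, top_k_mask(pretrained, lang_tuned, 1))

composed = compose(pretrained, task_diff, lang_diff)
print(composed)  # [1.5, -1.0, 2.0, 1.0]
```

The composed vector has the same length as the pretrained one, which mirrors the claim that the method adds no parameters at inference time and leaves the architecture unchanged. In this toy case the two sparse updates touch disjoint positions, so neither overwrites the other.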
Feb-9-2023