Language-Independent Representations Improve Zero-Shot Summarization

Vladimir Solovyev, Danni Liu, Jan Niehues

Apr-8-2024–arXiv.org Artificial Intelligence

Finetuning pretrained models on downstream generation tasks often leads to catastrophic forgetting in zero-shot conditions. In this work, we focus on summarization and tackle the problem through the lens of language-independent representations. After training on monolingual summarization, we perform zero-shot transfer to new languages or language pairs. We first show that naively finetuned models are highly language-specific in both output behavior and internal representations, resulting in poor zero-shot performance. Next, we propose query-key (QK) finetuning to decouple task-specific knowledge from the pretrained language generation abilities. Then, after showing the downsides of the standard adversarial language classifier, we propose a balanced variant that more directly enforces language-agnostic representations. Moreover, our qualitative analyses show that removing source language identity correlates with zero-shot summarization performance. Our code is openly available.
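
To make the idea of query-key (QK) finetuning concrete, here is a minimal sketch. It assumes QK finetuning means updating only the query and key projections of the attention layers while every other weight stays frozen; the abstract does not spell out these details, so this is an illustration and not the authors' implementation. The names `QK_PATTERN`, `is_qk_parameter`, and `apply_qk_finetuning` are hypothetical, and so is the `q_proj` / `k_proj` naming convention, which should be adapted to the checkpoint in use.

```python
import re
from typing import List

# Assumed naming convention for attention projections (hypothetical here;
# adjust the pattern to match the parameter names of the actual model).
QK_PATTERN = re.compile(r"(^|\.)(q_proj|k_proj)\.(weight|bias)$")


def is_qk_parameter(name: str) -> bool:
    """Return True if a parameter name belongs to a query or key projection."""
    return QK_PATTERN.search(name) is not None


def apply_qk_finetuning(model) -> List[str]:
    """Freeze every parameter except the query/key projections.

    `model` is any object exposing `named_parameters()` that yields
    (name, parameter) pairs whose parameters carry a `requires_grad`
    attribute, for example a torch.nn.Module.
    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        # Only query and key projections keep receiving gradient updates.
        param.requires_grad = is_qk_parameter(name)
        if param.requires_grad:
            trainable.append(name)
    return trainable
```

Calling `apply_qk_finetuning(model)` before building the optimizer leaves value projections, output projections, feed-forward layers, and embeddings at their pretrained values, which is one way to read "decoupling task-specific knowledge from the pretrained language generation abilities".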

computational linguistics, large language model, natural language

Country:
- Asia (0.93)
- Europe (1.00)
- North America > United States (0.93)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
