Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models

Kang, Haoqiang, Blevins, Terra, Zettlemoyer, Luke

Apr-26-2023–arXiv.org Artificial Intelligence

Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks such as translation and multilingual word sense disambiguation (WSD). However, they often struggle at disambiguating word sense in a zero-shot setting. To better understand this contrast, we present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT), an extension of word-level translation that prompts the model to translate a given word in context. We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance. Building on C-WLT, we introduce a zero-shot approach for WSD, tested on 18 languages from the XL-WSD dataset. Our method outperforms fully supervised baselines on recall for many evaluation languages without additional training or finetuning. This study presents a first step towards understanding how to best leverage the cross-lingual knowledge inside PLMs for robust zero-shot reasoning in any language.

artificial intelligence, natural language, translation, (15 more...)

arXiv.org Artificial Intelligence

Apr-26-2023

arXiv.org PDF

Add feedback

Country:
- Europe (0.28)
- North America > United States (0.46)

Genre:
- Research Report > New Finding (0.68)

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found