Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models

Ercong Nie, Shuzhou Yuan, Bolei Ma, Helmut Schmid, Michael Färber, Frauke Kreuter, Hinrich Schütze

Feb-28-2024 – arXiv.org Artificial Intelligence

Despite the predominance of English in their training data, English-centric Large Language Models (LLMs) like GPT-3 and LLaMA display a remarkable ability to perform multilingual tasks, raising questions about the depth and nature of their cross-lingual capabilities. This paper introduces the decomposed prompting approach to probe the linguistic structure understanding of these LLMs in sequence labeling tasks. Instead of using a single text-to-text prompt, our method generates an individual prompt for each token of the input sentence, asking for that token's linguistic label. We assess our method on the Universal Dependencies part-of-speech tagging dataset for 38 languages, utilizing both English-centric and multilingual LLMs. Our findings show that decomposed prompting surpasses the iterative prompting baseline in efficacy and efficiency under zero- and few-shot settings. Further analysis reveals the influence of evaluation methods and the use of instructions in prompts. Our multilingual investigation shows that English-centric language models perform better on average than multilingual models. Our study offers insights into the multilingual transferability of English-centric LLMs, contributing to the understanding of their multilingual linguistic knowledge.
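
To make the per-token idea concrete, the following is a minimal sketch of how one prompt per token could be constructed for part-of-speech tagging. It is an illustration only: the template wording, the constant `TEMPLATE`, and the function `build_decomposed_prompts` are assumptions introduced here, not the prompts or code used in the paper, and no model call is shown.

```python
from typing import List

# Hypothetical template: the abstract does not give the paper's actual wording.
TEMPLATE = (
    "Sentence: {sentence}\n"
    'In the sentence above, the part-of-speech tag of the word "{token}" is'
)


def build_decomposed_prompts(tokens: List[str], template: str = TEMPLATE) -> List[str]:
    """Return one prompt per input token (assumed interface, for illustration).

    Each prompt repeats the full sentence as context and asks about a single
    token, in contrast to one text-to-text prompt that requests all labels.
    """
    sentence = " ".join(tokens)
    return [template.format(sentence=sentence, token=tok) for tok in tokens]


if __name__ == "__main__":
    for prompt in build_decomposed_prompts(["The", "dog", "barks"]):
        print(prompt)
        print("---")
```

Each resulting prompt would then be sent to the language model separately, and the predicted labels collected in token order. Note that this simple sketch does not disambiguate a word that occurs more than once in the sentence; a fuller version would need to mark the position of the queried token.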

Country:
- Africa > Niger (0.04)
- North America
  - United States
    - Washington > King County
      - Seattle (0.04)
    - New York > New York County
      - New York City (0.04)
    - Maryland > Prince George's County
      - College Park (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Greece (0.04)
  - Spain > Valencian Community
    - Valencia Province > Valencia (0.04)
  - Germany
    - Berlin (0.04)
    - Bavaria > Upper Bavaria
      - Munich (0.04)
    - Baden-Württemberg > Karlsruhe Region
      - Karlsruhe (0.04)
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
- Asia
  - Singapore (0.04)
  - Southeast Asia (0.04)
  - Indonesia > Bali (0.04)
  - India (0.04)
  - Middle East
    - Syria (0.04)
    - Jordan (0.04)
    - UAE > Abu Dhabi Emirate
      - Abu Dhabi (0.04)
  - China > Heilongjiang Province
    - Harbin (0.04)

Genre:
- Research Report > New Finding (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.67)
