Reconsidering Token Embeddings with the Definitions for Pre-trained Language Models

Zhang, Ying, Li, Dongyuan, Okumura, Manabu

Aug-2-2024–arXiv.org Artificial Intelligence

Learning token embeddings from token co-occurrence statistics has proven effective for both pre-training and fine-tuning in natural language processing. However, recent studies have pointed out that the distribution of learned embeddings degenerates into anisotropy, and that even pre-trained language models (PLMs) lose semantics-related information in the embeddings of low-frequency tokens. This study first analyzes the fine-tuning dynamics of a PLM, BART-large, and demonstrates its robustness against degeneration. On the basis of this finding, we propose DefinitionEMB, a method that uses definitions to construct isotropically distributed and semantics-related token embeddings for PLMs while maintaining the original robustness during fine-tuning. Our experiments demonstrate the effectiveness of leveraging definitions from Wiktionary to construct such embeddings for RoBERTa-base and BART-large. Furthermore, the constructed embeddings for low-frequency tokens improve the performance of these models across various GLUE tasks and four text summarization datasets.
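To make the notion of anisotropy concrete, the sketch below computes the mean pairwise cosine similarity of a set of embedding vectors, a commonly used proxy for how narrowly the vectors cluster in direction. This is an illustrative assumption and not necessarily the metric used in the paper; the function names (`mean_pairwise_cosine`, `make_embeddings`) and the synthetic toy data are hypothetical and stand in for a real embedding matrix.

```python
import math
import random


def mean_pairwise_cosine(embeddings):
    """Average cosine similarity over all distinct pairs of vectors.

    Values near 0 suggest directions are spread out (more isotropic);
    values near 1 suggest vectors share a dominant direction (anisotropic).
    """
    # Normalize each vector to unit length so dot products equal cosines.
    unit = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        unit.append([x / norm for x in vec])

    total, count = 0.0, 0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            total += sum(a * b for a, b in zip(unit[i], unit[j]))
            count += 1
    return total / count


def make_embeddings(n, dim, offset, seed=0):
    """Toy data (hypothetical): Gaussian noise plus a shared offset per dimension."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) + offset for _ in range(dim)] for _ in range(n)]


# Zero offset: directions are roughly uniform.
isotropic = make_embeddings(n=50, dim=32, offset=0.0)
# Large shared offset: every vector leans toward the same direction.
anisotropic = make_embeddings(n=50, dim=32, offset=3.0)

iso_score = mean_pairwise_cosine(isotropic)
aniso_score = mean_pairwise_cosine(anisotropic)
print(f"isotropic toy embeddings:   {iso_score:.3f}")
print(f"anisotropic toy embeddings: {aniso_score:.3f}")
```

With a real model, the same function could be applied to rows of the token embedding matrix, for example comparing low-frequency and high-frequency tokens; the quadratic pair loop is only practical for a sample of rows.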

Country:
- North America > United States
  - California (0.04)
- Europe
  - Croatia (0.14)
  - Serbia (0.04)
  - Monaco (0.04)
  - Holy See (0.04)
  - Czechia (0.04)
- Asia > Japan
  - Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre:
- Research Report
  - New Finding (0.46)
  - Experimental Study (0.46)

Industry:
- Leisure & Entertainment (0.93)
- Law (0.93)
- Media > Film (0.67)
- Government (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Text Processing (1.00)
    - Large Language Model (0.94)
    - Machine Translation (0.68)
  - Machine Learning > Neural Networks
    - Deep Learning (0.68)
