Language Model Pre-Training with Sparse Latent Typing
Liliang Ren, Zixuan Zhang, Han Wang, Clare R. Voss, Chengxiang Zhai, Heng Ji
arXiv.org Artificial Intelligence
Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most LM pre-training objectives focus only on text reconstruction and do not seek to learn interpretable latent-level representations of sentences. In this paper, we push language models toward a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model learns interpretable latent type categories in a self-supervised manner without using any external knowledge. In addition, a language model pre-trained with this objective also significantly improves Information Extraction related downstream tasks in both supervised and few-shot settings. Our code is publicly available at: https://github.com/renll/SparseLT.
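The abstract only sketches the idea of sparse keyword extraction with latent types. As a rough, hypothetical illustration (not the paper's actual formulation; the class name SparseLatentTyper and the parameters num_types and sparsity_weight are invented here), the following PyTorch snippet shows one way a per-token gate and learned latent type embeddings could be combined into an auxiliary typing loss alongside a standard LM objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseLatentTyper(nn.Module):
    """Toy sketch: gate each token (sparse keyword selection) and assign
    selected tokens to one of `num_types` learned latent type embeddings."""
    def __init__(self, hidden_dim=768, num_types=16, sparsity_weight=0.1):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, 1)      # keep/drop score per token
        self.type_embeddings = nn.Parameter(torch.randn(num_types, hidden_dim))
        self.sparsity_weight = sparsity_weight

    def forward(self, token_states, tau=1.0):
        # token_states: (batch, seq_len, hidden_dim) from any encoder
        keep_prob = torch.sigmoid(self.gate(token_states))            # (B, T, 1)

        # Soft assignment of each token to a latent type; Gumbel-softmax
        # gives a differentiable, near-discrete choice.
        logits = token_states @ self.type_embeddings.t()              # (B, T, K)
        type_assign = F.gumbel_softmax(logits, tau=tau, hard=False)   # (B, T, K)
        typed_repr = type_assign @ self.type_embeddings               # (B, T, H)

        # Typing loss: selected tokens should lie close to their type embedding.
        # Sparsity loss: push most gates toward zero so only keywords survive.
        typing_loss = (keep_prob *
                       (token_states - typed_repr).pow(2).mean(-1, keepdim=True)).mean()
        sparsity_loss = keep_prob.mean()
        return typing_loss + self.sparsity_weight * sparsity_loss

# Usage: add this auxiliary loss to the usual LM pre-training loss.
hidden = torch.randn(2, 32, 768)   # stand-in for encoder hidden states
aux_loss = SparseLatentTyper()(hidden)
print(aux_loss.item())
```

This sketch only illustrates the general recipe (token gating plus self-supervised type assignment); the paper's objective and regularization details are in the linked repository.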
Oct-26-2022