Language Model Pre-Training with Sparse Latent Typing
Liliang Ren, Zixuan Zhang, Han Wang, Clare R. Voss, Chengxiang Zhai, Heng Ji
arXiv.org Artificial Intelligence
Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most LM pre-training objectives focus only on text reconstruction and do not seek to learn interpretable latent-level representations of sentences. In this paper, we push language models toward a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model learns interpretable latent type categories in a self-supervised manner without using any external knowledge. In addition, a language model pre-trained with this objective also significantly improves Information Extraction related downstream tasks in both supervised and few-shot settings. Our code is publicly available at: https://github.com/renll/SparseLT.
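The abstract only sketches the idea of sparse keyword extraction with latent types. As a rough, hypothetical illustration (not the paper's actual formulation; the class name SparseLatentTyper and the parameters num_types and sparsity_weight are invented here), the following PyTorch snippet shows one way a per-token gate and learned latent type embeddings could be combined into an auxiliary typing loss alongside a standard LM objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseLatentTyper(nn.Module):
    """Toy sketch: gate each token (sparse keyword selection) and assign
    selected tokens to one of `num_types` learned latent type embeddings."""
    def __init__(self, hidden_dim=768, num_types=16, sparsity_weight=0.1):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, 1)      # keep/drop score per token
        self.type_embeddings = nn.Parameter(torch.randn(num_types, hidden_dim))
        self.sparsity_weight = sparsity_weight

    def forward(self, token_states, tau=1.0):
        # token_states: (batch, seq_len, hidden_dim) from any encoder
        keep_prob = torch.sigmoid(self.gate(token_states))            # (B, T, 1)

        # Soft assignment of each token to a latent type; Gumbel-softmax
        # gives a differentiable, near-discrete choice.
        logits = token_states @ self.type_embeddings.t()              # (B, T, K)
        type_assign = F.gumbel_softmax(logits, tau=tau, hard=False)   # (B, T, K)
        typed_repr = type_assign @ self.type_embeddings               # (B, T, H)

        # Typing loss: selected tokens should lie close to their type embedding.
        # Sparsity loss: push most gates toward zero so only keywords survive.
        typing_loss = (keep_prob *
                       (token_states - typed_repr).pow(2).mean(-1, keepdim=True)).mean()
        sparsity_loss = keep_prob.mean()
        return typing_loss + self.sparsity_weight * sparsity_loss

# Usage: add this auxiliary loss to the usual LM pre-training loss.
hidden = torch.randn(2, 32, 768)   # stand-in for encoder hidden states
aux_loss = SparseLatentTyper()(hidden)
print(aux_loss.item())
```

This sketch only illustrates the general recipe (token gating plus self-supervised type assignment); the paper's objective and regularization details are in the linked repository.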
Oct-26-2022