
Reviews: Unified Language Model Pre-training for Natural Language Understanding and Generation

Neural Information Processing Systems

This paper provides a method to pretrain a single Transformer architecture on three objectives: (i) unidirectional (left-to-right) language modelling, (ii) bidirectional masked language modelling, and (iii) sequence-to-sequence language modelling. This unified architecture circumvents the shortcomings of models like BERT (which can condition on bidirectional context, but is harder to use for downstream generation tasks because of that bidirectionality) and GPT-2 (easy to apply to generation tasks since it works left-to-right, though bidirectional encoders are known to work much better than unidirectional ones in sequence-to-sequence models), thereby combining the best of both worlds. This is achieved with a simple masking scheme that restricts which words the model can attend to, depending on which objective function is used (e.g. with a unidirectional, left-to-right objective, all tokens to the right of the target word are masked out). Experiments on text summarisation (CNN/DailyMail and Gigaword), question answering (SQuAD, CoQA extractive, and CoQA abstractive), question generation, and GLUE indicate that the proposed pretraining approach largely matches or surpasses the current state of the art. Crucially, the masking approach enables pretraining the two key ingredients of sequence-to-sequence models within a single model: (i) a bidirectional encoder, and (ii) a unidirectional decoder.
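The masking scheme described above can be illustrated with a small sketch. The function below builds a binary self-attention mask (1 = token in the row may attend to the token in the column) for the three objectives; the function name, argument layout, and NumPy implementation are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def unilm_attention_mask(n_src, n_tgt=0, mode="bidirectional"):
    """Sketch of UniLM-style attention masks over a sequence of
    n_src source tokens followed by n_tgt target tokens.
    Entry [i, j] == 1 means token i may attend to token j."""
    n = n_src + n_tgt
    if mode == "unidirectional":
        # Left-to-right LM: each token sees only itself and tokens to its left.
        return np.tril(np.ones((n, n), dtype=int))
    if mode == "bidirectional":
        # BERT-style masked LM: every token may attend to every token.
        return np.ones((n, n), dtype=int)
    if mode == "seq2seq":
        mask = np.zeros((n, n), dtype=int)
        # All tokens may attend to the whole source segment
        # (so the source acts as a bidirectional encoder).
        mask[:, :n_src] = 1
        # Target tokens additionally attend left-to-right within the target
        # (so the target acts as a unidirectional decoder).
        mask[n_src:, n_src:] = np.tril(np.ones((n_tgt, n_tgt), dtype=int))
        return mask
    raise ValueError(f"unknown mode: {mode}")
```

For example, with two source and two target tokens in seq2seq mode, the source rows see only the source, while the target rows see the source plus their own left context, which is exactly the encoder/decoder split the review describes.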


Reviews: Unified Language Model Pre-training for Natural Language Understanding and Generation

Neural Information Processing Systems

This paper presents an alternative training regime for the BERT contextual embedding model that incorporates additional conditioning contexts, such as left-to-right language modelling and sequence transduction. The reviewers agree that the work is well motivated and a reasonable attempt to address some of the issues with the original BERT model. The results are suitably strong, and the paper is therefore likely to be of interest to those working on contextual embedding models, although it is puzzling that a classic language-modelling perplexity evaluation was not included, given that this is one of the objectives the model optimises. The authors' final paper should incorporate the answers to the questions raised by the reviewers.