MVP: Multi-task Supervised Pre-training for Natural Language Generation
Tang, Tianyi, Li, Junyi, Zhao, Wayne Xin, Wen, Ji-Rong
arXiv.org Artificial Intelligence
Pre-trained language models (PLMs) have achieved remarkable success in natural language generation (NLG) tasks. To date, most NLG-oriented PLMs have been pre-trained in an unsupervised manner on large-scale general corpora. Meanwhile, a growing number of models pre-trained with labeled data (i.e., "supervised pre-training") outperform their unsupervised counterparts. Motivated by the success of supervised pre-training, we propose Multi-task superVised Pre-training (MVP) for natural language generation. We collect a large-scale natural language generation corpus, MVPCorpus, from $77$ datasets over $11$ diverse NLG tasks. We then unify these examples into a general text-to-text format and pre-train the text generation model MVP on them in a supervised manner. For each task, we further pre-train task-specific soft prompts to stimulate the model's capacity to perform that task. MVP can be seen as an application of recent instruction tuning to relatively small PLMs. Extensive experiments demonstrate the effectiveness and generality of MVP across a number of NLG tasks: it achieves state-of-the-art performance on $13$ out of $17$ datasets, outperforming BART by $9.3\%$ and Flan-T5 by $5.8\%$.
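To make the idea of a unified text-to-text format concrete, here is a minimal sketch of how labeled examples from different NLG tasks might be cast into (source, target) string pairs. The abstract does not specify the exact format, so the instruction prefixes, the `Seq2SeqExample` type, and the `to_text_to_text` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Seq2SeqExample:
    """One supervised example in a shared text-to-text form (hypothetical type)."""
    source: str
    target: str


# Hypothetical instruction prefixes: an assumption for this sketch,
# not the prefixes used by the authors.
TASK_INSTRUCTIONS = {
    "summarization": "Summarize:",
    "question_generation": "Generate the question:",
    "data_to_text": "Describe the following data:",
}


def to_text_to_text(task: str, inputs: Sequence[str], target: str) -> Seq2SeqExample:
    """Cast a task-specific labeled example into a plain (source, target) pair."""
    if task not in TASK_INSTRUCTIONS:
        raise KeyError(f"unknown task: {task}")
    # Prepend the task instruction, then join the non-empty input fields.
    fields = [s.strip() for s in inputs if s.strip()]
    source = " ".join([TASK_INSTRUCTIONS[task], *fields])
    return Seq2SeqExample(source=source, target=target.strip())


example = to_text_to_text("summarization", ["  The cat sat on the mat. "], " A cat sat. ")
print(example.source)
print(example.target)
```

Once every dataset is reduced to such pairs, a single sequence-to-sequence model can be trained on all of them with one objective, which is the property the multi-task setup relies on.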
May-28-2023
- Country:
- Oceania > Australia
- North America
- Dominican Republic (0.04)
- United States
- Washington > King County
- Seattle (0.14)
- Texas
- Taylor County > Abilene (0.04)
- Travis County > Austin (0.04)
- New York > New York County
- New York City (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- California > San Diego County
- San Diego (0.04)
- Canada
- Quebec > Montreal (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Europe
- Czechia > Prague (0.04)
- Italy > Tuscany
- Florence (0.04)
- Germany
- Saarland > Saarbrücken (0.04)
- North Rhine-Westphalia > Düsseldorf Region
- Düsseldorf (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Spain
- Community of Madrid > Madrid (0.05)
- Valencian Community > Valencia Province
- Valencia (0.04)
- Catalonia > Barcelona Province
- Barcelona (0.04)
- United Kingdom > England
- Greater London > London (0.04)
- Portugal > Lisbon
- Lisbon (0.04)
- Netherlands > South Holland
- The Hague (0.04)
- Finland > Uusimaa
- Helsinki (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- Singapore (0.04)
- Taiwan > Taiwan Province
- Taipei (0.04)
- Middle East
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.14)
- Palestine > Gaza Strip
- Gaza Governorate > Gaza (0.04)
- Israel > Jerusalem District
- Jerusalem (0.04)
- Japan > Kyūshū & Okinawa
- Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
- China
- Genre:
- Research Report > New Finding (0.87)
- Industry:
- Law (1.00)
- Government > Regional Government (1.00)
- Consumer Products & Services (0.68)
- Education (0.67)
- Transportation
- Health & Medicine > Therapeutic Area
- Psychiatry/Psychology (0.45)
- Technology: