Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning
Wu, Hongqiu, Ding, Ruixue, Zhao, Hai, Chen, Boli, Xie, Pengjun, Huang, Fei, Zhang, Min
arXiv.org Artificial Intelligence
Combining multiple pre-training objectives compensates for the limited understanding capability of single-objective language modeling and serves the ultimate purpose of pre-trained language models (PrLMs): generalizing well across a wide range of scenarios. However, learning multiple training objectives in a single model is challenging because their relative significance is unknown and they may conflict with one another. Empirical studies have shown that current objective sampling, configured manually in an ad-hoc manner, makes the learned language representation barely converge to the desired optimum. We therefore propose MOMETAS, a novel adaptive sampler based on meta-learning, which learns the latent sampling pattern over arbitrary pre-training objectives. The design is lightweight and adds negligible training overhead. To validate our approach, we adopt five objectives and conduct continual pre-training with BERT-base and BERT-large models, where MOMETAS demonstrates universal performance gains over rule-based sampling strategies on 14 natural language processing tasks.
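The abstract does not spell out how the sampler is updated, so the following is only an illustrative sketch of what an adaptive objective sampler could look like, not the authors' implementation. It assumes a softmax distribution over per-objective logits and a REINFORCE-style update driven by a scalar reward (for example, a held-out improvement measured after training on the sampled objective). The class name, the objective labels, the learning rate, and the reward signal are all hypothetical.

```python
import math
import random


class AdaptiveObjectiveSampler:
    """Hypothetical sampler: softmax over learnable logits, one per objective."""

    def __init__(self, objectives, lr=0.1, seed=0):
        self.objectives = list(objectives)
        self.logits = [0.0] * len(self.objectives)  # uniform at the start
        self.lr = lr
        self.rng = random.Random(seed)

    def probs(self):
        # Numerically stable softmax over the logits.
        m = max(self.logits)
        exps = [math.exp(x - m) for x in self.logits]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self):
        # Draw the index of the objective to train on next.
        indices = range(len(self.objectives))
        return self.rng.choices(indices, weights=self.probs(), k=1)[0]

    def update(self, index, reward, baseline=0.0):
        # REINFORCE step (an assumption, not taken from the abstract):
        # d log p[index] / d logit[j] = 1[j == index] - p[j].
        p = self.probs()
        advantage = reward - baseline
        for j in range(len(self.logits)):
            grad = (1.0 if j == index else 0.0) - p[j]
            self.logits[j] += self.lr * advantage * grad


def run_demo(steps=50):
    # Placeholder objective labels; the reward below is a stand-in for a
    # held-out signal and simply favours the objective at index 1.
    sampler = AdaptiveObjectiveSampler(["obj_a", "obj_b", "obj_c"], lr=0.5)
    for _ in range(steps):
        i = sampler.sample()
        reward = 1.0 if i == 1 else 0.0
        sampler.update(i, reward, baseline=0.5)
    return sampler.probs()
```

In a real training loop, the sampled index would select which pre-training loss to optimize for the next batch or interval, and the reward would come from evaluating the model rather than from a fixed rule as in `run_demo`.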
Oct-19-2022
- Country:
  - Oceania > Australia
    - New South Wales > Sydney (0.04)
  - North America
    - Dominican Republic (0.04)
    - United States
      - Utah > Salt Lake County
        - Salt Lake City (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Colorado > Denver County
        - Denver (0.04)
      - California > Los Angeles County
        - Long Beach (0.04)
    - Puerto Rico > San Juan
      - San Juan (0.04)
    - Canada
  - Europe
    - France (0.04)
    - Belgium (0.04)
    - Austria (0.04)
    - Sweden > Stockholm
      - Stockholm (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Iceland > Capital Region
      - Reykjavik (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
  - Asia
    - Middle East > Jordan (0.04)
    - China > Shanghai
      - Shanghai (0.04)
  - Africa > Ethiopia
    - Addis Ababa > Addis Ababa (0.04)
- Genre:
  - Instructional Material (1.00)
  - Research Report (0.82)
- Technology: