TapWeight: Reweighting Pretraining Objectives for Task-Adaptive Pretraining

Ruiyi Zhang, Sai Ashish Somayajula, Pengtao Xie

arXiv.org Artificial Intelligence 

Large-scale general-domain pretraining followed by downstream-specific finetuning has become a predominant paradigm in machine learning. However, discrepancies between the pretraining and target domains can still degrade performance in certain cases, underscoring the need for task-adaptive continued pretraining (TAP). TAP methods typically involve continued pretraining on task-specific unlabeled datasets or introducing additional unsupervised learning objectives to enhance model capabilities. While many TAP methods perform continued pretraining with multiple pretraining objectives, they often set the tradeoff parameters between objectives manually, resulting in suboptimal outcomes and higher computational costs. In this paper, we propose TapWeight, a task-adaptive pretraining framework that automatically determines the optimal importance of each pretraining objective based on downstream feedback. We applied TapWeight to both molecular property prediction and natural language understanding tasks, significantly surpassing baseline methods. Our code is publicly available at https://anonymous.4open.science/

Foundation models pretrained on large-scale general-domain corpora have achieved state-of-the-art performance across a wide range of tasks (He et al., 2021; Devlin et al., 2019; Brown et al., 2020). These models, which capture general knowledge for specific modalities such as text or images through unsupervised learning, are typically adapted to downstream tasks via finetuning. However, when there is a domain discrepancy between the pretraining corpus and the target task, direct finetuning of the pretrained model often fails to deliver optimal results (Lee et al., 2020; Chen et al., 2023; Xie et al., 2024).
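The core idea, learning the tradeoff weights over pretraining objectives from downstream feedback rather than setting them by hand, can be illustrated on a toy problem. The sketch below is NOT the paper's implementation: it assumes a scalar model parameter, two quadratic stand-ins for pretraining objectives, a quadratic stand-in for the downstream loss, and a finite-difference approximation of the hypergradient in place of whatever bilevel solver TapWeight actually uses. All function names (`pretrain_losses`, `downstream_loss`, `inner_step`) are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Two toy "pretraining" objectives on a scalar parameter w:
# each pulls w toward a different target value.
targets = np.array([1.0, 3.0])

def downstream_loss(w):
    # The "downstream task" prefers w near 3.0, so the second
    # objective should end up with the larger weight.
    return (w - 3.0) ** 2

def inner_step(w, lam, lr=0.1, steps=50):
    # Continued pretraining: minimize the lam-weighted sum of objectives.
    for _ in range(steps):
        grad = np.sum(lam * 2.0 * (w - targets))
        w -= lr * grad
    return w

logits = np.zeros(2)  # unconstrained tradeoff parameters; softmax keeps weights positive
w0 = 0.0
eps = 1e-4
for _ in range(100):
    lam = softmax(logits)
    base = downstream_loss(inner_step(w0, lam))
    # Finite-difference estimate of d(downstream loss)/d(logits):
    # perturb each logit, rerun continued pretraining, compare.
    grads = np.zeros_like(logits)
    for i in range(len(logits)):
        pert = logits.copy()
        pert[i] += eps
        grads[i] = (downstream_loss(inner_step(w0, softmax(pert))) - base) / eps
    logits -= 1.0 * grads

lam = softmax(logits)
# lam[1] ends up much larger than lam[0]: the objective aligned with the
# downstream task is automatically upweighted.
```

In a real setting the inner loop is gradient-based pretraining of a full model and the outer gradient is obtained with an implicit or truncated-unrolling approximation rather than finite differences, which would require one full pretraining run per objective per outer step.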