Improved Fine-Tuning by Better Leveraging Pre-Training Data

Neural Information Processing Systems 

As a dominant paradigm, fine-tuning a pre-trained model on target data is widely used in deep learning applications, especially when the target data set is small.