Response to Reviews of " Co-Tuning for Transfer Learning "

Neural Information Processing Systems

We thank all reviewers for their detailed reviews. However, the major technique is still feature fine-tuning (a.k.a. transfer learning). In the following, we respond to common questions first and then to major concerns of each reviewer. Each dataset has a train/test split, and each method has access to the same set of training data.


Review for NeurIPS paper: Co-Tuning for Transfer Learning

Neural Information Processing Systems

I am changing my score up a bit.

General comments:
1) The process seems to be two-step (as opposed to end-to-end learning): first derive the connection between source and target labels (by training a separate network to do this), and then, using this connection, train a target model while requiring the output (target labels) to conform to the derived connection. Are both steps happening on the same target dataset?
2) It is not clear whether the method works when the number of target classes is larger than the number of source classes.
3) The authors state that their setting is one where source data is not available, but their calibration actually requires the source data. Alternatively, the neural net g should in theory be able to learn the calibration, as long as enough model complexity is used.

Experiments:
1) A reasonable baseline would be just the source model (full) plus one or several new layers for the target.
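The first step the reviewer describes, deriving the connection between source and target labels, can be sketched roughly as follows. This is an illustrative guess at one simple estimator, not the paper's exact procedure: assume we already have calibrated source-class probability vectors for each target-training example, and average them per target class to obtain a row-stochastic relationship matrix. The function name `estimate_relationship` is hypothetical.

```python
import numpy as np

def estimate_relationship(source_probs, target_labels, num_target):
    """Estimate p(source category | target category) by class-wise averaging.

    source_probs:  (n, num_source) calibrated source-head predictions
                   for the n target-training examples (hypothetical input).
    target_labels: (n,) integer target labels in [0, num_target).
    Returns a (num_target, num_source) row-stochastic matrix.
    """
    rel = np.zeros((num_target, source_probs.shape[1]))
    for c in range(num_target):
        # average the calibrated source predictions of all examples in class c
        rel[c] = source_probs[target_labels == c].mean(axis=0)
    return rel
```

Because each row is an average of probability vectors, each row sums to one, so the matrix can be read directly as a conditional distribution over source categories given a target category.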


Review for NeurIPS paper: Co-Tuning for Transfer Learning

Neural Information Processing Systems

This paper presents a simple method which seems to work well in practice. Some reviewers would have preferred to see more discussion of the method's limitations. However, the contribution was deemed clear enough without this discussion, because of the intuitive, novel take on the popular fine-tuning task, as well as the strong performance demonstrated on popular vision tasks. Overall, the paper is expected to be of interest to the community.


Co-Tuning for Transfer Learning

Neural Information Processing Systems

Fine-tuning pre-trained deep neural networks (DNNs) on a target dataset, also known as transfer learning, is widely used in computer vision and NLP. Because task-specific layers mainly contain categorical information and categories vary with datasets, practitioners only \textit{partially} transfer pre-trained models by discarding task-specific layers and fine-tuning bottom layers. However, it is a reckless loss to simply discard task-specific parameters, which take up as much as 20\% of the total parameters in pre-trained models. To \textit{fully} transfer pre-trained models, we propose a two-step framework named \textbf{Co-Tuning}: (i) learn the relationship between source categories and target categories from the pre-trained model and calibrated predictions; (ii) target labels (one-hot labels), as well as source labels (probabilistic labels) translated by the category relationship, collaboratively supervise the fine-tuning process. A simple instantiation of the framework shows strong empirical results on four visual classification tasks and one NLP classification task, bringing up to 20\% relative improvement. While state-of-the-art fine-tuning techniques mainly focus on how to impose regularization when data are not abundant, Co-Tuning works not only on medium-scale datasets (100 samples per class) but also on large-scale datasets (1000 samples per class), where regularization-based methods bring no gains over vanilla fine-tuning.
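The collaborative supervision in step (ii) can be sketched as a combined loss: the usual cross-entropy on the target head, plus a cross-entropy on the retained source head against soft source labels obtained by translating the one-hot target labels through the category relationship. This is a minimal sketch assuming a row-stochastic relationship matrix is already available; the function name `co_tuning_loss` and the weighting `lam` are illustrative, not the paper's exact formulation.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def co_tuning_loss(target_logits, source_logits, target_onehot, relationship, lam=1.0):
    """Illustrative Co-Tuning objective (sketch, not the paper's exact loss).

    target_logits: (batch, num_target) outputs of the new target head
    source_logits: (batch, num_source) outputs of the retained source head
    target_onehot: (batch, num_target) one-hot target labels
    relationship:  (num_target, num_source) row-stochastic p(source | target)
    lam:           hypothetical trade-off weight between the two terms
    """
    # translate hard target labels into probabilistic source labels
    soft_source = target_onehot @ relationship          # (batch, num_source)
    # standard fine-tuning loss on the target head
    ce_target = -(target_onehot * np.log(softmax(target_logits))).sum(axis=1).mean()
    # collaborative loss on the source head against translated labels
    ce_source = -(soft_source * np.log(softmax(source_logits))).sum(axis=1).mean()
    return ce_target + lam * ce_source
```

With `lam = 0` this reduces to vanilla fine-tuning, which is one way to see why the framework subsumes the partial-transfer baseline.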