Transferable Normalization: Towards Improving Transferability of Deep Neural Networks

Ximei Wang, Ying Jin, Mingsheng Long, Jianmin Wang, Michael I. Jordan

Neural Information Processing Systems 

Deep neural networks (DNNs) excel at learning representations when trained on large-scale datasets. Pre-trained DNNs also show strong transferability when fine-tuned on other labeled datasets. However, this transferability weakens when the target dataset is fully unlabeled, as in Unsupervised Domain Adaptation (UDA). We hypothesize that the loss of transferability stems from an intrinsic limitation in the architecture design of DNNs. In this paper, we delve into the components of DNN architectures and propose Transferable Normalization (TransNorm) as a replacement for existing normalization techniques.
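The abstract does not spell out the TransNorm mechanism. As a rough intuition only, the sketch below illustrates one ingredient commonly used in domain-adaptive normalization: computing batch statistics separately for the source and target domains while sharing the learnable affine parameters. This is a hypothetical illustration of that general idea, not the paper's exact TransNorm layer (which involves additional machinery), and the function name and signature are assumptions.

```python
import numpy as np

def domain_specific_norm(x_source, x_target, gamma, beta, eps=1e-5):
    """Hypothetical sketch: normalize source and target mini-batches
    with their own per-channel statistics, then apply a single shared
    set of affine parameters (gamma, beta). This illustrates only the
    domain-specific-statistics idea, not the full TransNorm layer."""
    def normalize(x):
        mu = x.mean(axis=0)   # per-channel mean over the mini-batch
        var = x.var(axis=0)   # per-channel variance over the mini-batch
        return (x - mu) / np.sqrt(var + eps)

    # Each domain is standardized against its own statistics,
    # but both domains share one scale/shift parameterization.
    y_source = gamma * normalize(x_source) + beta
    y_target = gamma * normalize(x_target) + beta
    return y_source, y_target
```

With `gamma = 1` and `beta = 0`, both outputs have approximately zero mean and unit variance per channel, regardless of how far apart the two domains' input distributions are.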