Bi-level Unbalanced Optimal Transport for Partial Domain Adaptation

Chen, Zi-Ying, Ren, Chuan-Xian, Yan, Hong

arXiv.org Artificial Intelligence 

The partial domain adaptation (PDA) problem requires aligning cross-domain samples while distinguishing the outlier classes for accurate knowledge transfer. The widely used weighting framework tries to address the outlier classes by introducing a reweighted source domain whose label distribution is similar to that of the target domain. However, the empirical modeling of weights can only characterize sample-wise relations, which leads to insufficient exploration of cluster structures; moreover, the weights can be sensitive to inaccurate predictions and cause confusion on the outlier classes. To tackle these issues, we propose a Bi-level Unbalanced Optimal Transport (BUOT) model to simultaneously characterize the sample-wise and class-wise relations in a unified transport framework. Specifically, a cooperation mechanism between sample-level and class-level transport is introduced, where the sample-level transport provides essential structure information for the class-level knowledge transfer, while the class-level transport supplies discriminative information for outlier identification. The bi-level transport plan provides guidance for the alignment process. By incorporating a label-aware transport cost, the local transport structure is ensured, and a fast computation formulation is derived to improve efficiency.

Corresponding author. Email address: rchuanx@mail.sysu.edu.cn

Introduction

Traditional machine learning usually follows the assumption that training data and test data come from the same distribution. In practice, however, this assumption is often violated, and the resulting distribution discrepancy can degrade the performance of machine learning models when they are deployed in new environments or domains. To overcome this challenge, unsupervised domain adaptation (UDA) [1, 2] has been developed to transfer knowledge from a labeled source domain to an unlabeled target domain, enabling models trained on the source domain to generalize well to the target domain.
UDA methods usually train the model on source-domain samples to minimize the source classification error, and then apply alignment techniques to reduce the cross-domain divergence.
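To make the unbalanced-transport idea concrete, the following is a minimal, hedged sketch (not the paper's actual BUOT algorithm) of entropic unbalanced optimal transport solved with generalized Sinkhorn iterations. The KL relaxation of the marginal constraints, controlled by tau, is what allows mass on source outlier classes to be discarded instead of forcibly matched; all names and parameter values here are illustrative assumptions.

```python
import numpy as np

def unbalanced_sinkhorn(a, b, C, eps=0.05, tau=1.0, n_iter=500):
    """Illustrative entropic unbalanced OT (generalized Sinkhorn).

    a, b : source/target marginal weights (1-D arrays)
    C    : pairwise transport cost matrix, shape (len(a), len(b))
    eps  : entropic regularization strength
    tau  : KL penalty weight relaxing the marginal constraints;
           smaller tau lets more source mass (e.g. outlier classes)
           be dropped rather than transported.
    """
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a, dtype=float)
    v = np.ones_like(b, dtype=float)
    fi = tau / (tau + eps)               # damping exponent from the KL relaxation
    for _ in range(n_iter):
        u = (a / (K @ v)) ** fi
        v = (b / (K.T @ u)) ** fi
    return u[:, None] * K * v[None, :]   # transport plan P
```

With `tau -> inf` the exponent `fi -> 1` and the updates reduce to classical balanced Sinkhorn; finite `tau` shrinks the plan's marginals toward zero where transport is too costly, which is the behavior PDA methods exploit for outlier classes.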