Wasserstein Distance Guided Representation Learning for Domain Adaptation

AAAI Conferences

Domain adaptation aims at generalizing a high-performance learner on a target domain via utilizing the knowledge distilled from a source domain which has a different but related data distribution. One solution to domain adaptation is to learn domain invariant feature representations while the learned representations should also be discriminative in prediction. To learn such representations, domain adaptation frameworks usually include a domain invariant representation learning approach to measure and reduce the domain discrepancy, as well as a discriminator for classification. Inspired by Wasserstein GAN, in this paper we propose a novel approach to learn domain invariant feature representations, namely Wasserstein Distance Guided Representation Learning (WDGRL). WDGRL utilizes a neural network, denoted by the domain critic, to estimate empirical Wasserstein distance between the source and target samples and optimizes the feature extractor network to minimize the estimated Wasserstein distance in an adversarial manner. The theoretical advantages of Wasserstein distance for domain adaptation lie in its gradient property and promising generalization bound. Empirical studies on common sentiment and image classification adaptation datasets demonstrate that our proposed WDGRL outperforms the state-of-the-art domain invariant representation learning approaches.
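
The critic step described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the critic is a hypothetical linear function, the features are fixed lists rather than outputs of a trained feature extractor, and the Lipschitz constraint is enforced by projecting the weights onto the unit ball, since the abstract does not say how the constraint is handled. All function names are invented for this example.

```python
def critic(h, w):
    """Hypothetical linear domain critic f(h) = w . h."""
    return sum(wi * hi for wi, hi in zip(w, h))


def wasserstein_estimate(source_feats, target_feats, w):
    """Critic objective: mean critic score on source minus mean score on target."""
    mean_s = sum(critic(h, w) for h in source_feats) / len(source_feats)
    mean_t = sum(critic(h, w) for h in target_feats) / len(target_feats)
    return mean_s - mean_t


def project_to_unit_ball(w):
    """Keep ||w||_2 <= 1 so the linear critic stays 1-Lipschitz (assumed constraint)."""
    norm = sum(wi * wi for wi in w) ** 0.5
    return list(w) if norm <= 1.0 else [wi / norm for wi in w]


def train_critic(source_feats, target_feats, steps=50, lr=0.1):
    """Projected gradient ascent on the critic objective."""
    dim = len(source_feats[0])
    # For a linear critic the gradient w.r.t. w is the difference of feature means.
    grad = [
        sum(h[j] for h in source_feats) / len(source_feats)
        - sum(h[j] for h in target_feats) / len(target_feats)
        for j in range(dim)
    ]
    w = [0.0] * dim
    for _ in range(steps):
        w = project_to_unit_ball([wi + lr * gi for wi, gi in zip(w, grad)])
    return w


# Toy features: the source is shifted away from the target along the first axis.
source = [[2.0, 0.0], [4.0, 0.0]]
target = [[0.0, 0.0], [0.0, 0.0]]
w = train_critic(source, target)
estimate = wasserstein_estimate(source, target, w)
```

In the adversarial setup the abstract describes, the feature extractor would then be updated to reduce this estimate, for instance by moving the target features toward the source features, while the critic is retrained to maximise it.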


Shen


Semi-supervised representation learning via dual autoencoders for domain adaptation

arXiv.org Machine Learning

Domain adaptation, which exploits knowledge in a source domain to improve learning tasks in a target domain, plays a critical role in real-world applications. Recently, many deep learning approaches based on autoencoders have achieved significant performance in domain adaptation. However, most existing methods focus on minimizing the distribution divergence by putting the source data and target data together to learn global feature representations, but do not take into account the local relationship between instances of the same category in different domains. To address this problem, we propose a novel Semi-Supervised Representation Learning framework via Dual Autoencoders for domain adaptation, named SSRLDA. More specifically, we extract richer feature representations by learning the global and local feature representations simultaneously using two novel autoencoders, which are referred to as marginalized denoising autoencoder with adaptation distribution (MDA$_{ad}$) and multi-class marginalized denoising autoencoder (MMDA), respectively. Meanwhile, we adopt an iterative strategy to make full use of label information to optimize feature representations. Experimental results show that our proposed approach outperforms several state-of-the-art baseline methods.


Exploiting Local Feature Patterns for Unsupervised Domain Adaptation

arXiv.org Machine Learning

Unsupervised domain adaptation methods aim to alleviate performance degradation caused by domain shift by learning domain-invariant representations. Existing deep domain adaptation methods focus on holistic feature alignment by matching source and target holistic feature distributions, without considering local features and their multi-mode statistics. We show that the learned local feature patterns are more generic and transferable, and that a further local feature distribution matching enables fine-grained feature alignment. In this paper, we present a method for learning domain-invariant local feature patterns and jointly aligning holistic and local feature statistics. Comparisons to the state-of-the-art unsupervised domain adaptation methods on two popular benchmark datasets demonstrate the superiority of our approach and its effectiveness in alleviating negative transfer.


Network Transfer Learning via Adversarial Domain Adaptation with Graph Convolution

arXiv.org Machine Learning

Abstract: This paper studies the problem of cross-network node classification to overcome the insufficiency of labeled data in a single network. It aims to leverage the label information in a partially labeled source network to assist node classification in a completely unlabeled or partially labeled target network. Existing methods for single network learning cannot solve this problem due to the domain shift across networks. Some multi-network learning methods rely heavily on the existence of cross-network connections and are thus inapplicable to this problem. To tackle this problem, we propose a novel network transfer learning framework, AdaGCN, by leveraging the techniques of adversarial domain adaptation and graph convolution. It consists of two components: a semi-supervised learning component and an adversarial domain adaptation component. The former aims to learn class-discriminative node representations with the given label information of the source and target networks, while the latter mitigates the distribution divergence between the source and target domains to facilitate knowledge transfer. Extensive empirical evaluations on real-world datasets show that AdaGCN can successfully transfer class information with a low label rate on the source network and a substantial divergence between the source and target domains. Codes will be released upon acceptance.

Node classification is an important building block of numerous real-world applications, such as product recommendation in e-commerce websites, advertisement distribution in social networks, and protein function identification for disease diagnosis. Many research efforts have been made to develop reliable and efficient methods for node classification in networked data. In the era of big data, massive amounts of raw data in information networks are produced every day. However, labeled data is expensive and slow to acquire due to the high cost and long time of human annotation, making it difficult to train a well-generalized classifier [2]. Moreover, in some newly formed networks, such as a protein-protein interaction network constructed by some researchers, there may be no labels at all. Hence, it would be impossible to classify the nodes with only the information of this network. To tackle these issues, a promising approach is to utilize class information from other similar or related networks to assist in classification, i.e., transfer learning on networked data [3], [4].
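
As a rough illustration of the graph convolution that the abstract mentions, the sketch below applies one propagation step to a tiny graph. It is an assumption-laden example: the text above does not say which graph convolution variant AdaGCN uses, so this uses the widely known symmetric-normalised rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), and the function name `gcn_layer` and the toy inputs are hypothetical.

```python
def gcn_layer(adj, feats, weight):
    """One graph convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adj    -- n x n adjacency matrix (list of lists, 0/1 entries)
    feats  -- n x d node feature matrix H
    weight -- d x k weight matrix W
    """
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Degree of each node in the self-looped graph.
    deg = [sum(row) for row in a_hat]
    # Symmetric normalisation: entry (i, j) is a_hat[i][j] / sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)] for i in range(n)]
    d, k = len(weight), len(weight[0])
    # Aggregate neighbour features: norm @ feats.
    agg = [[sum(norm[i][m] * feats[m][c] for m in range(n)) for c in range(d)] for i in range(n)]
    # Linear transform followed by ReLU: max(0, agg @ weight).
    return [
        [max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d))) for o in range(k)]
        for i in range(n)
    ]


# Two connected nodes with scalar features 1 and 3 and an identity weight.
adj = [[0.0, 1.0], [1.0, 0.0]]
feats = [[1.0], [3.0]]
weight = [[1.0]]
out = gcn_layer(adj, feats, weight)
```

With this toy input each node's output is the average of its own feature and its neighbour's, which shows how a labelled node's information can spread to unlabelled neighbours within a network.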