Transfer Learning
Build Better Machine Learning Models in Less Time with Transfer Learning
Our control model was a well-established machine learning model using features that are known to work well. For text, the features are essentially normalized word counts (TF-IDF: term frequency / inverse document frequency vectors). For images, we used HOG features (histogram of oriented gradients). These features were fed into a logistic regression model for training and prediction. Our test model used a custom collection: we fed it data, trained a model, and made a prediction, with transfer learning for text and image analysis running under the covers.
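A minimal sketch of the text side of such a control model, assuming whitespace tokenization, a smoothed IDF variant, and a small hand-written logistic regression trained by stochastic gradient descent. The function names (`tfidf_vectors`, `train_logreg`, `predict`), the toy documents, and the hyperparameters are illustrative assumptions, not the article's actual pipeline.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors): L2-normalized TF-IDF vectors, one per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed inverse document frequency (an assumed variant).
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X, y, lr=0.5, epochs=200):
    """Fit binary logistic regression with plain SGD; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            g = p - yi  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


def predict(w, b, x):
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Toy data (hypothetical): 1 = positive review, 0 = negative review.
docs = ["good great film", "great fun movie", "bad boring film", "boring awful movie"]
labels = [1, 1, 0, 0]

vocab, X = tfidf_vectors(docs)
w, b = train_logreg(X, labels)
train_predictions = [predict(w, b, x) for x in X]
```

In practice a library implementation of both steps would be the usual choice; the hand-written version here only makes the "normalized word counts into logistic regression" pipeline concrete.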
Transfer Hashing with Privileged Information
Zhou, Joey Tianyi, Xu, Xinxing, Pan, Sinno Jialin, Tsang, Ivor W., Qin, Zheng, Goh, Rick Siow Mong
Most existing learning to hash methods assume that there are sufficient data, either labeled or unlabeled, on the domain of interest (i.e., the target domain) for training. However, this assumption cannot be satisfied in some real-world applications. To address this data sparsity issue in hashing, inspired by transfer learning, we propose a new framework named Transfer Hashing with Privileged Information (THPI). Specifically, we extend the standard learning to hash method, Iterative Quantization (ITQ), in a transfer learning manner, namely ITQ+. In ITQ+, a new slack function is learned from auxiliary data to approximate the quantization error in ITQ. We develop an alternating optimization approach to solve the resultant optimization problem for ITQ+. We further extend ITQ+ to LapITQ+ by utilizing the geometric structure among the auxiliary data for learning more precise binary codes in the target domain. Extensive experiments on several benchmark datasets verify the effectiveness of our proposed approaches through comparisons with several state-of-the-art baselines.
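For orientation, the standard ITQ objective that ITQ+ builds on can be written as below. This sketches plain ITQ only, not ITQ+ or LapITQ+. The notation is assumed here because the abstract defines none: $V \in \mathbb{R}^{n \times c}$ is the zero-centered, PCA-projected data, $R$ is a $c \times c$ rotation, and $B$ holds the binary codes.

```latex
\min_{B,\,R}\ \lVert B - VR \rVert_F^2
\quad \text{s.t.} \quad B \in \{-1,+1\}^{n \times c}, \qquad R^\top R = I_c

% Alternating updates:
% (1) fix R, update the codes:
B \leftarrow \operatorname{sgn}(VR)
% (2) fix B, solve the orthogonal Procrustes problem via an SVD:
B^\top V = S\,\Omega\,\hat{S}^\top
\quad \Longrightarrow \quad
R \leftarrow \hat{S}\,S^\top
```

The residual $B - VR$ is the quantization error. According to the abstract, this is the quantity that ITQ+ approximates with a slack function learned from auxiliary data.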
Transfer Learning for Cross-Language Text Categorization through Active Correspondences Construction
Zhou, Joey Tianyi (Institute of High Performance Computing) | Pan, Sinno Jialin (Nanyang Technological University) | Tsang, Ivor W. (University of Technology) | Ho, Shen-Shyang (Nanyang Technological University)
Most existing heterogeneous transfer learning (HTL) methods for cross-language text classification rely on sufficient cross-domain instance correspondences to learn a mapping across heterogeneous feature spaces, and assume that such correspondences are given in advance. However, in practice, correspondences between domains are usually unknown. In this case, extensive manual effort is required to establish accurate correspondences across multilingual documents based on their content and meta-information. In this paper, we present a general framework to integrate active learning to construct correspondences between heterogeneous domains for HTL, namely HTL through active correspondences construction (HTLA). Based on this framework, we develop a new HTL method. On top of the new HTL method, we further propose a strategy to actively construct correspondences between domains. Extensive experiments are conducted on various multilingual text classification tasks to verify the effectiveness of HTLA.
Learning by Transferring from Unsupervised Universal Sources
Wang, Yu-Xiong (Carnegie Mellon University) | Hebert, Martial (Carnegie Mellon University)
Category classifiers trained from a large corpus of annotated data are widely accepted as the sources for (hypothesis) transfer learning. Sources generated in this way are tied to a particular set of categories, limiting their transferability across a wide spectrum of target categories. In this paper, we address this largely overlooked yet fundamental source problem by both introducing a systematic scheme for generating universal source hypotheses and proposing a principled, scalable approach to automatically tuning the transfer process. Our approach is based on the insights that expressive source hypotheses could be generated without any supervision and that a sparse combination of such hypotheses facilitates recognition of novel categories from few samples. We demonstrate improvements over the state of the art on object and scene classification in the small sample size regime.
Active Learning with Cross-Class Knowledge Transfer
Guo, Yuchen (Tsinghua University) | Ding, Guiguang (Tsinghua University) | Wang, Yuqi (Tsinghua University) | Jin, Xiaoming (Tsinghua University)
When there are insufficient labeled samples for training a supervised model, we can adopt active learning to select the most informative samples for human labeling, or transfer learning to transfer knowledge from a related labeled data source. Combining transfer learning with active learning has attracted much research interest in recent years. Most existing works follow the setting where the class labels in the source domain are the same as the ones in the target domain. In this paper, we focus on a more challenging cross-class setting where the class labels are totally different in the two domains but related to each other in an intermediary attribute space, a setting that has barely been investigated before. We propose a novel and effective method that utilizes the attribute representation as the seed parameters to generate the classification models for classes. We also propose a joint learning framework that takes into account the knowledge from the related classes in the source domain and the information in the target domain. Moreover, it is simple to perform uncertainty sampling, a fundamental technique for active learning, based on the framework. We conduct experiments on three benchmark datasets and the results demonstrate the efficacy of the proposed method.
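Uncertainty sampling, as mentioned in the abstract above, can be illustrated with a generic entropy-based selector. This is a sketch of the general technique only, not of the paper's joint framework. The function names and the example probabilities are hypothetical, and the class probabilities are assumed to come from some already-trained classifier.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, batch_size=1):
    """Return indices of the unlabeled samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical predicted class probabilities for three unlabeled samples.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # close to uniform, high entropy
    [0.70, 0.20, 0.10],  # in between
]

# Indices of the samples that would be sent to a human annotator first.
picked = select_most_uncertain(pool_probs, batch_size=2)
```

Entropy is one of several common uncertainty measures; least-confidence and margin-based scores slot into the same `key` function.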
Collective Noise Contrastive Estimation for Policy Transfer Learning
Zhang, Weinan (University College London) | Paquet, Ulrich (Microsoft Research) | Hofmann, Katja (Microsoft Research)
We address the problem of learning behaviour policies to optimise online metrics from heterogeneous usage data. While online metrics, e.g., click-through rate, can be optimised effectively using exploration data, such data is costly to collect in practice, as it temporarily degrades the user experience. Leveraging related data sources to improve online performance would be extremely valuable, but is not possible using current approaches. We formulate this task as a policy transfer learning problem, and propose a first solution, called collective noise contrastive estimation (collective NCE). NCE is an efficient solution to approximating the gradient of a log-softmax objective. Our approach jointly optimises embeddings of heterogeneous data to transfer knowledge from the source domain to the target domain. We demonstrate the effectiveness of our approach by learning an effective policy for an online radio station jointly from user-generated playlists, and usage data collected in an exploration bucket.
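The role of NCE as a cheap stand-in for a full log-softmax can be sketched for a single training example. This shows plain NCE only, not the paper's collective variant. The function names, item names, scores, and noise distribution are hypothetical, and the noise samples are passed in explicitly rather than drawn at random so the example stays deterministic.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def nce_loss(scores, noise_probs, target, noise_samples):
    """NCE loss for one observed item against k noise items.

    scores:        item -> unnormalized model score s(item)
    noise_probs:   item -> probability q(item) under the noise distribution
    target:        the observed (true) item
    noise_samples: k items assumed to be drawn from the noise distribution
    """
    k = len(noise_samples)

    def delta(item):
        # Log-odds that `item` came from the data rather than the noise.
        return scores[item] - math.log(k * noise_probs[item])

    loss = -log_sigmoid(delta(target))
    for item in noise_samples:
        loss -= log_sigmoid(-delta(item))
    return loss


# Hypothetical scores and noise distribution over three items.
scores = {"rock": 2.0, "jazz": 0.5, "pop": -1.0}
noise_probs = {"rock": 0.5, "jazz": 0.3, "pop": 0.2}

# The loss is lower when the observed item is the one the model scores highly.
loss_good = nce_loss(scores, noise_probs, target="rock", noise_samples=["jazz", "pop"])
loss_bad = nce_loss(scores, noise_probs, target="pop", noise_samples=["rock", "jazz"])
```

The cost per example depends on the number of noise samples k rather than on the total number of items, which is where the efficiency over a full softmax normalization comes from.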
Instilling Social to Physical: Co-Regularized Heterogeneous Transfer Learning
Wei, Ying (Hong Kong University of Science and Technology) | Zhu, Yin (Hong Kong University of Science and Technology) | Leung, Cane Wing-ki (Wisers Research) | Song, Yangqiu (West Virginia University) | Yang, Qiang (Hong Kong University of Science and Technology)
Ubiquitous computing tasks, such as human activity recognition (HAR), are enabling a wide spectrum of applications, ranging from healthcare to environment monitoring. The success of a ubiquitous computing task relies on sufficient physical sensor data with ground-truth labels, which are always scarce due to the expensive annotation process. Meanwhile, social media platforms provide a lot of social or semantic context information. People frequently share what they are doing and where they are in the messages they post. This rich set of socially shared activities motivates us to transfer knowledge from social media to address the sparsity issue of labelled physical sensor data. In order to transfer the knowledge of social and semantic context, we propose a Co-Regularized Heterogeneous Transfer Learning (CoHTL) model, which builds a common semantic space derived from two heterogeneous domains. Our proposed method outperforms state-of-the-art methods on two ubiquitous computing tasks, namely human activity recognition and region function discovery.
Little Is Much: Bridging Cross-Platform Behaviors through Overlapped Crowds
Jiang, Meng (Tsinghua University) | Cui, Peng (Tsinghua University) | Yuan, Nicholas Jing (Microsoft Research Asia) | Xie, Xing (Microsoft Research Asia) | Yang, Shiqiang (Tsinghua University)
People often use multiple platforms to fulfill their different information needs. With the ultimate goal of serving people intelligently, a fundamental step is to gain a comprehensive understanding of user needs. How to organically integrate and bridge cross-platform information in a human-centric way is important. Existing transfer learning methods assume that the users of different platforms are either fully overlapped or non-overlapped. In reality, however, the users of different platforms are partially overlapped. The number of overlapped users is often small, and the number of explicitly known overlapped users is even smaller because users lack a unified ID across platforms. In this paper, we propose a novel semi-supervised transfer learning method to address the problem of cross-platform behavior prediction, called XPTrans. To alleviate the sparsity issue, it fully exploits the small number of overlapped crowds to optimally bridge a user's behaviors in different platforms. Extensive experiments across two real social networks show that XPTrans significantly outperforms the state of the art. We demonstrate that by fully exploiting 26% overlapped users, XPTrans can predict the behaviors of non-overlapped users with the same accuracy as overlapped users, which means the small overlapped crowds can successfully bridge the information across different platforms.
Learning to Learn
The ever-increasing pace of change in today's organizations requires that executives understand and then quickly respond to constant shifts in how their businesses operate and how work must get done. That means you must resist your innate biases against doing new things in new ways, scan the horizon for growth opportunities, and push yourself to acquire drastically different capabilities--while still doing your existing job. To succeed, you must be willing to experiment and become a novice over and over again, which for most of us is an extremely discomforting proposition. Over decades of work with managers, the author has found that people who do succeed at this kind of learning have four well-developed attributes: aspiration, self-awareness, curiosity, and vulnerability. They have a deep desire to understand and master new skills; they see themselves very clearly; they're constantly thinking of and asking good questions; and they tolerate their own mistakes as they move up the curve.
The Benefit of Multitask Representation Learning
Maurer, Andreas, Pontil, Massimiliano, Romera-Paredes, Bernardino
We discuss a general method to learn data representations from multiple tasks. We provide a justification for this method in both settings of multitask learning and learning-to-learn. The method is illustrated in detail in the special case of linear feature learning. Conditions on the theoretical advantage offered by multitask representation learning over independent task learning are established. In particular, focusing on the important example of half-space learning, we derive the regime in which multitask representation learning is beneficial over independent task learning, as a function of the sample size, the number of tasks and the intrinsic data dimensionality. Other potential applications of our results include multitask feature learning in reproducing kernel Hilbert spaces and multilayer, deep networks.