Zhong, Erheng
Geometry Aware Mappings for High Dimensional Sparse Factors
Bhowmik, Avradeep, Liu, Nathan, Zhong, Erheng, Bhaskar, Badri Narayan, Rajan, Suju
While matrix factorisation models are ubiquitous in large scale recommendation and search, real time application of such models requires inner product computations over an intractably large set of item factors. In this manuscript we present a novel framework that uses the inverted index representation to exploit structural properties of sparse vectors to significantly reduce the run time computational cost of factorisation models. We develop techniques that use geometry aware permutation maps on a tessellated unit sphere to obtain high dimensional sparse embeddings for latent factors with sparsity patterns related to angular closeness of the original latent factors. We also design several efficient and deterministic realisations within this framework and demonstrate with experiments that our techniques lead to faster run time operation with minimal loss of accuracy.
Active Transfer Learning for Cross-System Recommendation
Zhao, Lili (The Hong Kong University of Science and Technology) | Pan, Sinno Jialin (Institute for Infocomm Research) | Xiang, Evan Wei (Baidu Inc.) | Zhong, Erheng (The Hong Kong University of Science and Technology) | Lu, Zhongqi (The Hong Kong University of Science and Technology) | Yang, Qiang (Huawei Noahโs Ark Lab)
Recommender systems, especially the newly launched ones, have to deal with the data-sparsity issue, where little existing rating information is available. Recently, transfer learning has been proposed to address this problem by leveraging the knowledge from related recommender systems where rich collaborative data are available. However, most previous transfer learning models assume that entity-correspondences across different systems are given as input, which means that for any entity (e.g., a user or an item) in a target system, its corresponding entity in a source system is known. This assumption can hardly be satisfied in real-world scenarios where entity-correspondences across systems are usually unknown, and the cost of identifying them can be expensive. For example, it is extremely difficult to identify whether a user A from Facebook and a user B from Twitter are the same person. In this paper, we propose a framework to construct entity correspondence with limited budget by using active learning to facilitate knowledge transfer across recommender systems. Specifically, for the purpose of maximizing knowledge transfer, we first iteratively select entities in the target system based on our proposed criterion to query their correspondences in the source system. We then plug the actively constructed entity-correspondence mapping into a general transferred collaborative-filtering model to improve recommendation quality. We perform extensive experiments on real world datasets to verify the effectiveness of our proposed framework for this cross-system recommendation problem.
Selective Transfer Learning for Cross Domain Recommendation
Lu, Zhongqi, Zhong, Erheng, Zhao, Lili, Xiang, Wei, Pan, Weike, Yang, Qiang
Collaborative filtering (CF) aims to predict users' ratings on items according to historical user-item preference data. In many real-world applications, preference data are usually sparse, which would make models overfit and fail to give accurate predictions. Recently, several research works show that by transferring knowledge from some manually selected source domains, the data sparseness problem could be mitigated. However for most cases, parts of source domain data are not consistent with the observations in the target domain, which may misguide the target domain model building. In this paper, we propose a novel criterion based on empirical prediction error and its variance to better capture the consistency across domains in CF settings. Consequently, we embed this criterion into a boosting framework to perform selective knowledge transfer. Comparing to several state-of-the-art methods, we show that our proposed selective transfer learning framework can significantly improve the accuracy of rating prediction tasks on several real-world recommendation tasks.
Discovering Spammers in Social Networks
Zhu, Yin (Hong Kong University of Science and Technology (HKUST)) | Wang, Xiao (Renren Inc.) | Zhong, Erheng (Hong Kong University of Science and Technology (HKUST)) | Liu, Nathan N. (Hong Kong University of Science and Technology (HKUST)) | Li, He (Renren Inc.) | Yang, Qiang (Hong Kong University of Science and Technology (HKUST))
As the popularity of the social media increases, as evidenced in Twitter, Facebook and China's Renren, spamming activities also picked up in numbers and variety. On social network sites, spammers often disguise themselves by creating fake accounts and hijacking normal users' accounts for personal gains. Different from the spammers in traditional systems such as SMS and email, spammers in social media behave like normal users and they continue to change their spamming strategies to fool anti spamming systems. However, due to the privacy and resource concerns, many social media websites cannot fully monitor all the contents of users, making many of the previous approaches, such as topology-based and content-classification-based methods, infeasible to use. In this paper, we propose a novel method for spammer detection in social networks that exploits both social activities as well as users' social relations in an innovative and highly scalable manner. The proposed method detects spammers following collective activities based on users' social actions and relations. We have empirically tested our method on data from Renren.com, which is the largest social network in China, and demonstrated that our new method can improve the detection performance significantly.