Line Orthogonality in Adjacency Eigenspace with Application to Community Partition
Wu, Leting (University of North Carolina at Charlotte) | Ying, Xiaowei (University of North Carolina at Charlotte) | Wu, Xintao (University of North Carolina at Charlotte) | Zhou, Zhi-Hua (Nanjing University)
Unlike those of the Laplacian or normal matrix, the properties of the adjacency eigenspace have received little attention. Recent work showed that nodes projected into the adjacency eigenspace exhibit an orthogonal line pattern, with nodes from the same community lying along the same line. In this paper, we conduct theoretical studies based on graph perturbation to demonstrate why this line orthogonality property holds in the adjacency eigenspace and why it generally disappears in the Laplacian and normal eigenspaces. Using the orthogonality property in the adjacency eigenspace, we present a graph partition algorithm, AdjCluster, which first projects node coordinates to the unit sphere and then applies the classic k-means to find clusters. Empirical evaluations on synthetic data and real-world social networks validate our theoretical findings and show the effectiveness of our graph partition algorithm.
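The two steps named in the abstract (projection to the unit sphere, then k-means) can be illustrated with a minimal sketch. It is not the authors' AdjCluster implementation: the choice of the k eigenvectors with the largest eigenvalues, the farthest-point initialization, the plain Lloyd iterations, and the function name `adj_cluster_sketch` are all assumptions made for illustration.

```python
import numpy as np

def adj_cluster_sketch(A, k, n_iter=50):
    """Cluster nodes of an undirected graph from its adjacency eigenspace.

    Assumption: the k eigenvectors with the largest eigenvalues are used
    as node coordinates; the abstract does not specify this choice.
    """
    A = np.asarray(A, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(A)          # A is symmetric
    X = eigvecs[:, np.argsort(eigvals)[-k:]]      # one k-dim row per node

    # Project each node's coordinates onto the unit sphere.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.where(norms > 0, norms, 1.0)

    # Deterministic farthest-point initialization (an illustrative choice).
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers)

    # Plain Lloyd's k-means iterations.
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy graph: two 4-node cliques joined by a single edge (3, 4).
A = np.zeros((8, 8))
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1
labels = adj_cluster_sketch(A, k=2)
```

On this toy graph the nodes of each clique receive a common label, and the two cliques receive different labels.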
Matching Large Ontologies Based on Reduction Anchors
Wang, Peng (Southeast University) | Zhou, Yuming (Nanjing University) | Xu, Baowen (Nanjing University)
Matching large ontologies is challenging because of its high time complexity. This paper proposes a new matching method for large ontologies based on reduction anchors. The method has a distinct advantage over divide-and-conquer methods because it does not need to partition large ontologies. In particular, two kinds of reduction anchors, positive and negative, are proposed to reduce the time complexity of matching. Positive reduction anchors use the concept hierarchy to predict which similarity calculations can be skipped. Negative reduction anchors use the locality of matching to predict which similarity calculations can be skipped. Our experimental results on real-world data sets show that the proposed method is efficient for matching large ontologies.
A Wikipedia Based Semantic Graph Model for Topic Tracking in Blogosphere
Tang, Jintao (National University of Defense Technology) | Wang, Ting (National University of Defense Technology) | Lu, Qin (Hong Kong Polytechnic University) | Wang, Ji (National Laboratory for Parallel and Distributed Processing) | Li, Wenjie (Hong Kong Polytechnic University)
There are two key issues for information diffusion in the blogosphere: (1) blog posts are usually short, noisy, and contain multiple themes; (2) information diffusion through the blogosphere is primarily driven by the “word-of-mouth” effect, which makes topics evolve very fast. This paper presents a novel topic tracking approach to deal with these issues by modeling a topic as a semantic graph in which the semantic relatedness between terms is learned from Wikipedia. For a given topic/post, the named entities, Wikipedia concepts, and the semantic relatedness are extracted to generate the graph model. Noise is filtered out through a graph clustering algorithm. To handle topic evolution, the topic model is enriched by using Wikipedia as background knowledge. Furthermore, graph edit distance is used to measure the similarity between a topic and its posts. The proposed method is tested using real-world blog data. Experimental results show the advantage of the proposed method in tracking topics in short, noisy text.
Transfer Learning to Predict Missing Ratings Via Heterogeneous User Feedbacks
Pan, Weike (Hong Kong University of Science and Technology) | Liu, Nathan N. (Hong Kong University of Science and Technology) | Xiang, Evan W. (Hong Kong University of Science and Technology) | Yang, Qiang (Hong Kong University of Science and Technology)
Data sparsity due to missing ratings is a major challenge for collaborative filtering (CF) techniques in recommender systems. This is especially true for CF domains where the ratings are expressed numerically. We observe that, while we may lack the information in numerical ratings, we may have more data in the form of binary ratings. This is especially true when users can easily express themselves with their likes and dislikes for certain items. In this paper, we explore how to use the binary preference data expressed in the form of like/dislike to help reduce the impact of data sparsity of more expressive numerical ratings. We do this by transferring the rating knowledge from some auxiliary data source in binary form (that is, likes or dislikes), to a target numerical rating matrix. Our solution is to model both numerical ratings and like/dislike in a principled way, using a novel framework of Transfer by Collective Factorization (TCF). In particular, we construct the shared latent space collectively and learn the data-dependent effect separately. A major advantage of the TCF approach over previous collective matrix factorization (or bi-factorization) methods is that we are able to capture the data-dependent effect when sharing the data-independent knowledge, so as to increase the overall quality of knowledge transfer. Experimental results demonstrate the effectiveness of TCF at various sparsity levels as compared to several state-of-the-art methods.
Minimally Complete Recommendations
McSherry, David (University of Ulster)
Recent research has highlighted the benefits of completeness as a retrieval criterion in recommender systems. In complete retrieval, any subset of the constraints in a given query that can be satisfied must be satisfied by at least one of the retrieved products. Minimal completeness (i.e., always retrieving the smallest set of products needed for completeness) is also beginning to attract research interest as a way to minimize cognitive load in the approach. Other important features of a retrieval algorithm’s behavior include the diversity of the retrieved products and the order in which they are presented to the user. In this paper, we present a new algorithm for minimally complete retrieval (MCR) in which the ranking of retrieved products is primarily based on the number of constraints that they satisfy, but other measures such as similarity or utility can also be used to inform the retrieval process. We also present theoretical and empirical results that demonstrate our algorithm’s ability to minimize cognitive load while ensuring the completeness and diversity of the retrieved products.
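The completeness criterion above can be made concrete with a small brute-force sketch. It is not the MCR algorithm from the paper; it only relies on the observation that a subset of constraints is satisfiable exactly when some product satisfies all of it, so retrieving one product per maximal satisfied-constraint set is both complete and minimal. The function name, the toy catalog, and the constraint names are hypothetical.

```python
def minimally_complete(products, constraints):
    """Return a smallest complete retrieval set, ranked by constraints satisfied.

    products:    dict mapping product name -> attribute dict
    constraints: dict mapping constraint name -> predicate over attributes
    """
    # The set of query constraints each product satisfies.
    satisfied = {
        name: frozenset(c for c, pred in constraints.items() if pred(attrs))
        for name, attrs in products.items()
    }
    # Rank by number of satisfied constraints (ties broken by name).
    ranked = sorted(satisfied, key=lambda n: (-len(satisfied[n]), n))

    retrieved = []
    for name in ranked:
        # Skip a product whose satisfied set is already covered.
        if not any(satisfied[name] <= satisfied[r] for r in retrieved):
            retrieved.append(name)
    return retrieved

# Hypothetical catalog and query.
products = {
    "p1": {"price": 300, "brand": "A", "screen": 15},
    "p2": {"price": 500, "brand": "B", "screen": 17},
    "p3": {"price": 350, "brand": "A", "screen": 13},
}
constraints = {
    "cheap": lambda a: a["price"] <= 400,
    "brand_b": lambda a: a["brand"] == "B",
    "big": lambda a: a["screen"] >= 15,
}
result = minimally_complete(products, constraints)  # ["p1", "p2"]
```

Here p3 satisfies only "cheap", which p1 already covers, so it is not retrieved. No product satisfies both "cheap" and "brand_b", so that pair is not a satisfiable subset and need not be covered.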
Cross-Domain Collaborative Filtering over Time
Li, Bin (University of Technology, Sydney) | Zhu, Xingquan (University of Technology, Sydney) | Li, Ruijiang (Fudan University) | Zhang, Chengqi (University of Technology, Sydney) | Xue, Xiangyang (Fudan University) | Wu, Xindong (University of Vermont)
... items to users based on their historical ratings. In real-world scenarios, user interests may drift over time since they are affected by moods, contexts, and pop culture trends. This leads to the fact that a user's historical ratings comprise many aspects of user interests spanning a long time period. However, at a certain time slice, one user's interest may only focus on one or a couple of aspects. Thus, CF techniques based on the entire historical ratings may recommend inappropriate items. In this paper, we consider modeling user-interest drift over time based on the assumption that each user has multiple ...
Another example is that, although many people don't like animations, they may still have interests in emerging 3-D animations because of the fantastic 3-D visual effects. These observations show that, although many aspects of user interests can be found based on users' historical ratings, at a certain time slice, one user's interest may only focus on one or a couple of aspects. Thus, the static CF methods built on the entire historical ratings are inadequate to capture user-interest drift. In order to track user interests and create comprehensive user profiles such that different recommendation strategies can be used for consistent-taste users and changing-taste users, a CF method that can model user interests over time is required.
Multi-Perspective Linking of News Articles within a Repository
Khurdiya, Arpit (TCS Innovation Labs) | Dey, Lipika (TCS Innovation Labs) | Raj, Nidhi (TCS Innovation Labs) | Haque, Sk. Mirajul (TCS Innovation Labs)
Given the number of online news sources, the volume of news generated is so daunting that gaining insight from these collections becomes impossible without some aid to link them. Semantic linking of news articles groups similar or relevant news stories together for ease of human consumption. For example, a political analyst may like to have a single view of all news articles that report visits by heads of state of different countries to a single country, in order to write an in-depth analytical report on the possible impacts of all associated events. It is likely that no news source links all the relevant news together. In this paper, we discuss a multi-resolution, multi-perspective news analysis system that can link news articles collected from diverse sources over a period of time. The distinctive feature of the proposed news linking system is its capability to simultaneously link news articles and stories at multiple levels of granularity. At the lowest level, several articles reporting the same event are linked together. Higher-level groupings are more contextual and semantic. We have deployed a range of algorithms that use statistical text processing and Natural Language Processing techniques. The system is incremental in nature and depicts how stories have evolved over time along with their main actors and activities. It also illustrates how a single story diverges into multiple themes or how multiple stories converge due to conceptual similarity. The accuracy of linking thematically and conceptually related news articles is also presented.
Context Sensitive Topic Models for Author Influence in Document Networks
Kataria, Saurabh (The Pennsylvania State University) | Mitra, Prasenjit (The Pennsylvania State University) | Caragea, Cornelia (The Pennsylvania State University) | Giles, C. Lee (The Pennsylvania State University)
In a document network such as a citation network of scientific documents, web-logs, etc., the content produced by authors exhibits their interest in certain topics. In addition, some authors influence other authors' interests. In this work, we propose to model the influence of cited authors along with the interests of citing authors. Moreover, we hypothesize that, for citations present in documents, the context surrounding the citation mention provides extra topical information about the cited authors. However, associating terms in the context with the cited authors remains an open problem. We propose novel document generation schemes that incorporate the context while simultaneously modeling the interests of citing authors and the influence of the cited authors. Our experiments show significant improvements over baseline models on various evaluation criteria, such as link prediction between document and cited author and quantitatively explaining unseen text.
Fashion Coordinates Recommender System Using Photographs from Fashion Magazines
Iwata, Tomoharu (NTT) | Watanabe, Shinji (NTT) | Sawada, Hiroshi (NTT)
Fashion magazines contain a number of photographs of fashion models, and their clothing coordinates serve as useful references. In this paper, we propose a recommender system for clothing coordinates using full-body photographs from fashion magazines. The task is, given a photograph of a fashion item (e.g., tops) as a query, to recommend a photograph of other fashion items (e.g., bottoms) that is appropriate to the query. The proposed method uses a probabilistic topic model to learn information about coordinates from visual features in each fashion item region. We demonstrate the effectiveness of the proposed method using real photographs from a fashion magazine and two fashion style sharing services, on the task of recommending tops given photographs of bottoms and bottoms given photographs of tops.
A Convex Formulation of Modularity Maximization for Community Detection
Chan, Yun Kwan (Hong Kong University of Science and Technology) | Yeung, Dit-Yan (Hong Kong University of Science and Technology)
Complex networks pervade diverse areas ranging from the natural world to the engineered world and from traditional application domains to new and emerging domains, including web-based social networks. The study of network structural properties is of crucial importance to the understanding of many network phenomena, dynamics, and functions. One important type of network structure is known as community structure, which refers to the existence of communities that are tightly knit local groups with relatively dense connections among their members. Community detection is the problem of detecting these communities automatically. In this paper, based on the modularity measure proposed previously for community detection, we first propose a reformulation of an optimization problem for the 2-partition problem. Based on this new formulation, we can extend it naturally to tackle the general k-partition problem directly, without having to solve multiple 2-partition subproblems as other methods do. We then propose a convex relaxation scheme to give an iterative algorithm which solves a simple quadratic program in each iteration. We empirically compare our method with some related methods and find that our method is both scalable and competitive in performance by maintaining a good tradeoff between efficiency and quality.
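For reference, the modularity measure that the abstract builds on can be computed directly for a given partition. The sketch below assumes the measure is the standard Newman-Girvan modularity, Q = (1/2m) Σ_ij [A_ij − k_i k_j / 2m] δ(c_i, c_j), on an undirected graph. It only evaluates Q and does not implement the paper's convex relaxation or its quadratic programs. The function name `modularity` and the toy graph are illustrative.

```python
import numpy as np

def modularity(A, labels):
    """Newman-Girvan modularity of a partition of an undirected graph."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    degrees = A.sum(axis=1)                 # k_i
    two_m = degrees.sum()                   # 2m = sum of all degrees
    # B_ij = A_ij - k_i * k_j / 2m (the modularity matrix).
    B = A - np.outer(degrees, degrees) / two_m
    same = labels[:, None] == labels[None, :]   # delta(c_i, c_j)
    return float((B * same).sum() / two_m)

# Toy graph: two triangles joined by one edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1

q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # 5/14, about 0.357
q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # 0 for a single community
```

Splitting the graph into its two triangles gives Q = 2 × (3/7 − (7/14)²) = 5/14, while placing every node in one community gives Q = 0, which is why maximizing Q favors the tightly knit grouping.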