Perceiving Group Themes from Collective Social and Behavioral Information

AAAI Conferences

Collective social and behavioral information is common in nature. There is a widespread intuitive sense that the characteristics of this social and behavioral information are to some extent related to the themes (or semantics) of the activities or targets. In this paper, we explicitly validate the interplay between collective social and behavioral information and group themes using a large-scale real dataset of online groups, and demonstrate the possibility of perceiving group themes from collective social and behavioral information. We propose a REgularized miXEd Regression (REXER) model based on matrix factorization to infer hierarchical semantics (including both group category and group labels) from the collective social and behavioral information of group members. We extensively evaluate the proposed method on a large-scale real online group dataset. For the prediction of group themes, the proposed REXER achieves satisfactory performance on various criteria. More specifically, we can predict the category of a group (among 6 categories) purely from the collective social and behavioral information of the group with a Precision@1 of 55.16%, without any assistance from group labels or conversation contents. We also show, perhaps counterintuitively, that collective social and behavioral information is more reliable than the titles and labels of groups for inferring group categories.


On Information Coverage for Location Category Based Point-of-Interest Recommendation

AAAI Conferences

Point-of-interest (POI) recommendation has become a valuable service in location-based social networks. Based on the assumption that similar users are likely to have similar POI preferences, current recommendation techniques mainly focus on users' preferences to provide accurate recommendation results. This tends to generate a list of homogeneous POIs that are clustered into a narrow band of location categories (like food, museum, etc.) in a city. However, users are more interested in tasting a wide range of flavors drawn from the global set of location categories in the city. In this paper, we formulate a new POI recommendation problem, namely top-K location category based POI recommendation, by introducing information coverage to encode the location categories of POIs in a city. The problem is NP-hard. We develop a greedy algorithm and further optimizations to solve this challenging problem. The experimental results on two real-world datasets demonstrate the utility of the new POI recommendations and the superior performance of the proposed algorithms.
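
As a rough illustration of the greedy idea mentioned in the abstract, the sketch below picks up to K POIs by repeatedly taking the one that adds the most not-yet-covered location categories. This is the textbook greedy heuristic for maximum coverage, not the authors' algorithm or their information-coverage objective; the function name `greedy_category_coverage` and the toy POI data are hypothetical.

```python
def greedy_category_coverage(pois, k):
    """Pick up to k POIs, each time taking the one with the largest marginal gain.

    pois: dict mapping a POI name to the set of location categories it belongs to
          (hypothetical input format, assumed for illustration).
    Returns (chosen POI names in selection order, set of covered categories).
    """
    chosen = []
    covered = set()
    remaining = dict(pois)
    for _ in range(min(k, len(remaining))):
        # Iterate in sorted order so ties are broken deterministically by name.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # no remaining POI adds a new category
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered


# Toy data, invented for this example.
toy_pois = {
    "a": {"food"},
    "b": {"food", "museum"},
    "c": {"park"},
    "d": {"museum"},
}
chosen, covered = greedy_category_coverage(toy_pois, k=2)
print(chosen, sorted(covered))
```

With the toy data, the first pick is the POI covering two categories, and the second pick is the only POI that still contributes a new one. A real system would also weigh user preference, which this sketch ignores.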


VELDA: Relating an Image Tweet's Text and Images

AAAI Conferences

Image tweets are becoming a prevalent form of social media, but little is known about their content (textual and visual) and the relationship between the two mediums. Our analysis of image tweets shows that while visual elements certainly play a large role in image-text relationships, other factors, such as emotional elements, also factor into the relationship. We develop Visual-Emotional LDA (VELDA), a novel topic model to capture the image-text correlation from multiple perspectives (namely, visual and emotional). Experiments on real-world image tweets in both English and Chinese, and on other user-generated content, show that VELDA significantly outperforms existing methods on cross-modality image retrieval. Even in other domains where emotion does not factor into image choice directly, our VELDA model demonstrates good generalization ability, achieving higher-fidelity modeling of such multimedia documents.


Will You "Reconsume" the Near Past? Fast Prediction on Short-Term Reconsumption Behaviors

AAAI Conferences

Short-term reconsumption behaviors, i.e. "reconsuming" the near past, account for a large proportion of people's activities every day and everywhere. In this paper, we first derived four generic features which influence people's short-term reconsumption behaviors. These features were extracted with respect to different roles in the process of reconsumption behaviors, i.e. users, items and interactions. Then, we brought forward two fast algorithms with the linear and the quadratic kernels to predict whether a user will perform a short-term reconsumption at a specific time given the context. The experimental results show that our proposed algorithms are more accurate in the prediction tasks than the baselines. Meanwhile, the time complexity of online prediction of our algorithms is O(1), which enables fast prediction in real-world scenarios. The prediction contributes to more intelligent decision-making, e.g. identification of potential returning customers, personalized recommendation, and information re-finding.


A Personalized Interest-Forgetting Markov Model for Recommendations

AAAI Conferences

Intelligent item recommendation is a key issue in AI research which enables recommender systems to be more "human-minded" when generating recommendations. However, one of the major features of humans, forgetting, has barely been discussed with regard to recommender systems. In this paper, we considered people's forgetting of interest when performing personalized recommendations, and brought forward a personalized framework to integrate the interest-forgetting property with a Markov model. Multiple implementations of the framework were investigated and compared. The experimental evaluation showed that our methods could significantly improve the accuracy of item recommendation, which verified the importance of considering interest-forgetting in recommendations.


Efficient Top-k Shortest-Path Distance Queries on Large Networks by Pruned Landmark Labeling

AAAI Conferences

We propose an indexing scheme for top-k shortest-path distance queries on graphs, which is useful in a wide range of important applications such as network-aware search and link prediction. While considerable effort has been made to efficiently answer standard (top-1) distance queries, none of the previous methods can be directly extended to top-k distance queries. We propose a new framework for top-k distance queries based on 2-hop cover and then present an efficient indexing algorithm based on the simple but effective recent notion of pruned landmark labeling. Extensive experimental results on real social and web graphs show the scalability, efficiency and robustness of our method. Moreover, we demonstrate the usefulness of top-k distance queries through an application to link prediction.
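
For readers unfamiliar with 2-hop cover, the sketch below shows how a standard (top-1) distance query is answered from precomputed labels: each vertex stores distances to a set of hub vertices, and the distance between two vertices is the minimum over their shared hubs. It covers only this well-known top-1 query, not the paper's top-k extension or its pruned indexing algorithm. The function name `query_distance` and the hand-built labels for a three-vertex path graph are hypothetical.

```python
import math


def query_distance(labels, s, t):
    """Answer a top-1 distance query from 2-hop labels.

    labels: dict mapping each vertex to a dict {hub: distance to hub}
            (assumed to form a valid 2-hop cover of the graph).
    Returns the minimum of d(s, hub) + d(hub, t) over shared hubs,
    or math.inf when the two labels share no hub.
    """
    ls, lt = labels[s], labels[t]
    if len(ls) > len(lt):
        ls, lt = lt, ls  # scan the smaller label
    best = math.inf
    for hub, dist in ls.items():
        if hub in lt:
            best = min(best, dist + lt[hub])
    return best


# Hand-built labels for the path graph a - b - c with unit edge weights,
# plus an isolated vertex z. Invented for this example.
toy_labels = {
    "a": {"a": 0, "b": 1},
    "b": {"b": 0},
    "c": {"b": 1, "c": 0},
    "z": {"z": 0},
}
print(query_distance(toy_labels, "a", "c"))
```

Here vertices a and c share only the hub b, so the query sums their two stored distances to b.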


Equilibrium Points of an AND-OR Tree: under Constraints on Probability

arXiv.org Artificial Intelligence

We study a probability distribution d on the truth assignments to a uniform binary AND-OR tree. Liu and Tanaka [2007, Inform. Process. Lett.] showed the following: If d achieves the equilibrium among independent distributions (ID) then d is an independent identical distribution (IID). We show a stronger form of the above result. Given a real number r such that 0 < r < 1, we consider a constraint that the probability of the root node having the value 0 is r. Our main result is the following: When we restrict ourselves to IDs satisfying this constraint, the above result of Liu and Tanaka still holds. The proof employs clever tricks of induction. In particular, we show two fundamental relationships between expected cost and probability in an IID on an OR-AND tree: (1) The ratio of the cost to the probability (of the root having the value 0) is a decreasing function of the probability x of the leaf. (2) The ratio of derivative of the cost to the derivative of the probability is a decreasing function of x, too.
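
To make the two quantities in this abstract concrete, the sketch below computes, for a uniform binary tree with alternating AND and OR levels and IID leaves, the probability that the root is 0 and the expected number of leaves read. It assumes a plain left-to-right evaluation with short-circuiting and that x is the probability of a leaf being 0; both are assumptions made for illustration, and the function name `root_stats` is hypothetical. It is not the paper's proof technique.

```python
import math


def root_stats(height, x, root_is_and=True):
    """Return (P(root = 0), expected number of leaves read).

    height: number of internal levels above the leaves (0 means a single leaf).
    x: probability that a leaf has value 0, leaves being IID (assumed reading).
    Evaluation is left to right and skips the second child when the first
    child already determines the node's value.
    """
    p0, cost = x, 1.0  # a single leaf: it is 0 with probability x, one read
    for level in range(height):
        depth = height - 1 - level  # depth of the level being built, root = 0
        is_and = root_is_and if depth % 2 == 0 else not root_is_and
        if is_and:
            # AND reads the second child only if the first child is 1.
            cost = cost + (1 - p0) * cost
            # AND is 0 unless both children are 1.
            p0 = 1 - (1 - p0) ** 2
        else:
            # OR reads the second child only if the first child is 0.
            cost = cost + p0 * cost
            # OR is 0 only if both children are 0.
            p0 = p0 ** 2
    return p0, cost


p_root_zero, expected_reads = root_stats(height=2, x=0.5)
print(p_root_zero, expected_reads, expected_reads / p_root_zero)
```

The last printed value is the cost-to-probability ratio that relationship (1) in the abstract concerns; sweeping x over (0, 1) gives a quick numerical look at how it varies under these assumptions.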


Connectedness of graphs and its application to connected matroids through covering-based rough sets

arXiv.org Artificial Intelligence

Graph-theoretical ideas are widely used in computer science, especially in data mining, where a data structure can be designed in the form of a tree. Covering is a widely used form of data representation in data mining, and covering-based rough sets provide a systematic approach to this type of representation. In this paper, we study the connectedness of graphs through covering-based rough sets and apply it to connected matroids. First, we present an approach to inducing a covering from a graph, and then study the connectedness of the graph from the viewpoint of the covering approximation operators. Second, we construct a graph from a matroid, and find that the matroid and the graph have the same connectedness, which enables us to use covering-based rough sets to study connected matroids. In summary, this paper provides a new approach to studying graph theory and matroid theory.


Dependence space of matroids and its application to attribute reduction

arXiv.org Artificial Intelligence

Attribute reduction is a basic issue in knowledge representation and data mining. Rough sets provide a theoretical foundation for the issue. Matroids, generalized from matrices, have been widely used in many fields, particularly greedy algorithm design, which plays an important role in attribute reduction. Therefore, it is meaningful to combine matroids with rough sets to solve optimization problems. In this paper, we introduce an existing algebraic structure called dependence space to study the reduction problem in terms of matroids. First, a dependence space of matroids is constructed. Second, characterizations of the space, such as consistent sets and reducts, are studied through matroids. Finally, we investigate matroids by means of the space and present two expressions for their bases. In short, this paper provides new approaches to studying attribute reduction.


Statistical modality tagging from rule-based annotations and crowdsourcing

arXiv.org Machine Learning

We explore training an automatic modality tagger. Modality is the attitude that a speaker might have toward an event or state. One of the main hurdles for training a linguistic tagger is gathering training data. This is particularly problematic for training a tagger for modality because modality triggers are sparse for the overwhelming majority of sentences. We investigate an approach to automatically training a modality tagger in which we first gather sentences using a simple, high-recall rule-based modality tagger and then provide these sentences to Mechanical Turk annotators for further annotation. We use the resulting set of training data to train a precise modality tagger using a multi-class SVM that delivers good performance.