Plotting

 Cui, Peng


Perceiving Group Themes from Collective Social and Behavioral Information

AAAI Conferences

Collective social and behavioral information commonly exists in nature. There is a widespread intuitive sense that the characteristics of these social and behavioral information are to some extend related to the themes (or semantics) of the activities or targets. In this paper, we explicitly validate the interplay of collective social behavioral information and group themes using a large scale real dataset of online groups, and demonstrate the possibility of perceiving group themes from collective social and behavioral information. We propose a REgularized miXEd Regression (REXER) model based on matrix factorization to infer hierarchical semantics (including both group category and group labels) from collective social and behavioral information of group members. We extensively evaluate the proposed method in a large scale real online group dataset. For the prediction of group themes, the proposed REXER achieves satisfactory performances in various criterions. More specifically, we can predict the category of a group (among 6 categories) purely based on the collective social and behavioral information of the group with the Precision@1 to be 55.16% , without any assistance from group labels or conversation contents. We also show, perhaps counterintuitively, that the collective social and behavioral information is more reliable than the titles and labels of groups for inferring the group categories.


Probabilistic Attributed Hashing

AAAI Conferences

Due to the simplicity and efficiency, many hashing methods have recently been developed for large-scale similarity search. Most of the existing hashing methods focus on mapping low-level features to binary codes, but neglect attributes that are commonly associated with data samples. Attribute data, such as image tag, product brand, and user profile, can represent human recognition better than low-level features. However, attributes have specific characteristics, including high-dimensional, sparse and categorical properties, which is hardly leveraged into the existing hashing learning frameworks. In this paper, we propose a hashing learning framework, Probabilistic Attributed Hashing (PAH), to integrate attributes with low-level features. The connections between attributes and low-level features are built through sharing a common set of latent binary variables, i.e. hash codes, through which attributes and features can complement each other. Finally, we develop an efficient iterative learning algorithm, which is generally feasible for large-scale applications. Extensive experiments and comparison study are conducted on two public datasets, i.e., DBLP and NUS-WIDE. The results clearly demonstrate that the proposed PAH method substantially outperforms the peer methods.