Collaborating Authors

 Zhejiang University


Multi-Label Community-Based Question Classification via Personalized Sequence Memory Network Learning

AAAI Conferences

Multi-label community-based question classification is a challenging problem in Community-based Question Answering (CQA) services, arising in many real applications such as question navigation and expert finding. Most existing approaches treat the problem as a content-based tag suggestion task, which suffers from textual sparsity. Unlike previous studies, we consider the problem of multi-label community-based question classification from the viewpoint of personalized sequence learning. We introduce the personalized sequence memory network, which leverages not only the semantics of questions but also the personalized information of askers to provide a sequence tag learning function that captures high-order tag dependency. The experiment on a real-world dataset shows the effectiveness of our method.


Distance-Aware DAG Embedding for Proximity Search on Heterogeneous Graphs

AAAI Conferences

Proximity search on heterogeneous graphs aims to measure the proximity between two nodes on a graph w.r.t. some semantic relation for ranking. Pioneering work often measures such proximity by the paths connecting the two nodes. However, paths, as linear sequences, have limited expressiveness for complex network connections. In this paper, we explore a more expressive DAG (directed acyclic graph) data structure for modeling the connections between two nodes. In particular, we are interested in learning a representation of the DAGs that encodes the proximity between two nodes. We face two challenges in using DAGs: how to generate DAGs efficiently and how to learn DAG embeddings effectively for proximity search. We find distance-awareness to be important for proximity search and the key to solving both challenges. We therefore develop a novel Distance-aware DAG Embedding (D2AGE) model. We evaluate D2AGE on three benchmark data sets with six semantic relations, and we show that D2AGE outperforms the state-of-the-art baselines. We release the code at https://github.com/shuaiOKshuai.


PixelLink: Detecting Scene Text via Instance Segmentation

AAAI Conferences

Most state-of-the-art scene text detection algorithms are deep learning based methods that depend on bounding box regression and perform at least two kinds of predictions: text/non-text classification and location regression. Regression plays a key role in the acquisition of bounding boxes in these methods, but it is not indispensable because text/non-text prediction can also be considered as a kind of semantic segmentation that contains full location information in itself. However, text instances in scene images often lie very close to each other, making them very difficult to separate via semantic segmentation. Therefore, instance segmentation is needed to address this problem. In this paper, PixelLink, a novel scene text detection algorithm based on instance segmentation, is proposed. Text instances are first segmented out by linking pixels within the same instance together. Text bounding boxes are then extracted directly from the segmentation result without location regression. Experiments show that, compared with regression-based methods, PixelLink can achieve better or comparable performance on several benchmarks, while requiring many fewer training iterations and less training data.
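The linking step described above can be illustrated with a small union-find sketch. This is not the PixelLink implementation: the function names, the plain nested-list inputs, and the restriction to right and down links are assumptions made for illustration only. It assumes a per-pixel text/non-text decision and per-pixel link decisions are already available, groups linked text pixels into instances, and reads an axis-aligned box off each instance without any regression.

```python
def group_pixels(text_mask, link_right, link_down):
    """Group text pixels into instances with union-find (illustrative sketch).

    text_mask[r][c]  -- True if pixel (r, c) is predicted as text
    link_right[r][c] -- True if (r, c) is linked to (r, c + 1)   (assumed input)
    link_down[r][c]  -- True if (r, c) is linked to (r + 1, c)   (assumed input)
    """
    rows, cols = len(text_mask), len(text_mask[0])
    parent = {(r, c): (r, c)
              for r in range(rows) for c in range(cols) if text_mask[r][c]}

    def find(p):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for (r, c) in list(parent):
        # Two neighbouring text pixels join the same instance only if linked.
        if c + 1 < cols and text_mask[r][c + 1] and link_right[r][c]:
            union((r, c), (r, c + 1))
        if r + 1 < rows and text_mask[r + 1][c] and link_down[r][c]:
            union((r, c), (r + 1, c))

    instances = {}
    for p in parent:
        instances.setdefault(find(p), []).append(p)
    return list(instances.values())


def bounding_box(pixels):
    """Axis-aligned box (min_row, min_col, max_row, max_col) of one instance."""
    rs = [r for r, _ in pixels]
    cs = [c for _, c in pixels]
    return (min(rs), min(cs), max(rs), max(cs))


# Four adjacent text pixels; the missing link between columns 1 and 2
# splits them into two instances even though the text mask is contiguous.
mask = [[True, True, True, True]]
right = [[True, False, True, False]]
down = [[False, False, False, False]]
boxes = sorted(bounding_box(p) for p in group_pixels(mask, right, down))
print(boxes)  # [(0, 0, 0, 1), (0, 2, 0, 3)]
```

The example shows why link predictions matter: a plain semantic segmentation mask would merge the four pixels into one region, while the link decisions keep the two neighbouring instances apart.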


Dynamic Network Embedding by Modeling Triadic Closure Process

AAAI Conferences

Network embedding, which aims to learn low-dimensional representations of vertices, is an important task and has attracted considerable research effort recently. In the real world, networks such as social networks and biological networks are dynamic and evolve over time. However, almost all existing network embedding methods focus on static networks and ignore network dynamics. In this paper, we present a novel representation learning approach, DynamicTriad, to preserve both the structural information and the evolution patterns of a given network. The general idea of our approach is to build on the triad, a group of three vertices that is one of the basic units of networks. In particular, we model how a closed triad, which consists of three vertices connected with each other, develops from an open triad, in which two of the three vertices are not connected with each other. This triadic closure process is a fundamental mechanism in the formation and evolution of networks, which enables our model to capture network dynamics and to learn representation vectors for each vertex at different time steps. Experimental results on three real-world networks demonstrate that, compared with several state-of-the-art techniques, DynamicTriad achieves substantial gains in several application scenarios. For example, our approach can be applied effectively to identify telephone fraud in a mobile network and to predict whether a user will repay her loans in a loan network.
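To make the open and closed triad definitions concrete, the following sketch enumerates both kinds of triad in an undirected graph and reports which open triads at one time step have closed by the next. It only does the bookkeeping; it is not the DynamicTriad model, and the function names and the edge-list input format are assumptions made for illustration.

```python
from itertools import combinations


def classify_triads(edges):
    """Return (closed, open) triads of an undirected graph given as edge pairs.

    closed -- set of frozensets {u, v, w} whose three vertices are all connected
    open   -- set of (center, frozenset({a, b})) where the center is connected
              to a and b, but a and b are not connected to each other
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    closed, open_ = set(), set()
    for center, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                closed.add(frozenset((center, a, b)))
            else:
                open_.add((center, frozenset((a, b))))
    return closed, open_


def closed_between(edges_t, edges_t1):
    """Open triads at step t that appear as closed triads at step t + 1."""
    _, open_t = classify_triads(edges_t)
    closed_t1, _ = classify_triads(edges_t1)
    return {(c, pair) for c, pair in open_t
            if frozenset((c, *pair)) in closed_t1}


# Vertex 1 knows 2 and 3 at step t; 2 and 3 connect at step t + 1.
step_t = [(1, 2), (1, 3)]
step_t1 = [(1, 2), (1, 3), (2, 3)]
print(closed_between(step_t, step_t1))  # {(1, frozenset({2, 3}))}
```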


Reinforcement Learning for Relation Classification From Noisy Data

AAAI Conferences

Existing relation classification methods that rely on distant supervision assume that all sentences in a bag mentioning an entity pair describe a relation for that entity pair. Such methods, which perform classification at the bag level, cannot identify the mapping between a relation and a sentence, and largely suffer from the noisy labeling problem. In this paper, we propose a novel model for relation classification at the sentence level from noisy data. The model has two modules: an instance selector and a relation classifier. The instance selector chooses high-quality sentences with reinforcement learning and feeds the selected sentences into the relation classifier, and the relation classifier makes sentence-level predictions and provides rewards to the instance selector. The two modules are trained jointly to optimize the instance selection and relation classification processes. Experimental results show that our model can deal with noise in the data effectively and obtains better performance for relation classification at the sentence level.


Search Action Sequence Modeling With Long Short-Term Memory for Search Task Success Evaluation

AAAI Conferences

Search task success rate is a crucial metric, based on the search experience of users, for measuring the performance of search systems. Modeling search action sequences helps to capture the latent search patterns of users in successful and unsuccessful search tasks. Existing approaches use aggregated features to describe user behavior in search action sequences; these depend on heuristic hand-crafted feature design and ignore much of the information inherent in the user behavior. In this paper, we employ Long Short-Term Memory (LSTM), trained end-to-end, to learn search action sequence representations for search task success evaluation. Concretely, we normalize the search action sequences by introducing a dummy idle action, which guarantees that the time intervals between contiguous actions are fixed. In addition, we propose a novel data augmentation strategy that increases the pattern variation in search action sequence data to improve the generalization ability of the LSTM. We evaluate the proposed approach on open datasets with two different definitions of search task success. The experimental results show that the proposed approach achieves significant performance improvement compared with several strong search task success evaluation approaches.
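One way to read the idle-action normalization is sketched below. The abstract does not give the exact procedure, so everything here is an assumption for illustration: actions are taken to be time-sorted `(timestamp_seconds, name)` pairs, `step` is a hypothetical fixed interval, gaps are rounded to whole steps, and the `IDLE` token name is made up. The sketch fills each gap with idle tokens so that neighbouring tokens in the output are one step apart.

```python
IDLE = "IDLE"  # hypothetical name for the dummy idle action


def normalize_actions(actions, step):
    """Insert idle tokens so consecutive output tokens are `step` seconds apart.

    actions -- list of (timestamp_seconds, action_name), sorted by timestamp
    step    -- assumed fixed interval in seconds; gaps are rounded to whole steps
    """
    if not actions:
        return []
    out = [actions[0][1]]
    for (t_prev, _), (t_cur, name) in zip(actions, actions[1:]):
        # A gap of k steps needs k - 1 idle tokens between the two actions.
        gap_steps = max(1, round((t_cur - t_prev) / step))
        out.extend([IDLE] * (gap_steps - 1))
        out.append(name)
    return out


log = [(0, "query"), (10, "click"), (40, "query")]
print(normalize_actions(log, step=10))
# ['query', 'click', 'IDLE', 'IDLE', 'query']
```

The resulting token sequence has a uniform time base, which is the property a sequence model such as an LSTM would need in order to pick up on dwell time implicitly.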


Unsupervised Articulated Skeleton Extraction From Point Set Sequences Captured by a Single Depth Camera

AAAI Conferences

Robustly and accurately extracting articulated skeletons from point set sequences captured by a single consumer-grade depth camera remains an unresolved challenge. To address this issue, we propose a novel, unsupervised approach consisting of three contributions (steps): (i) a non-rigid point set registration algorithm that first builds one-to-one point correspondences among the frames of a sequence; (ii) a skeletal structure extraction algorithm that generates a skeleton with reasonable numbers of joints and bones; (iii) a skeleton joint estimation algorithm that produces accurate joints. In the end, our method can produce a quality articulated skeleton from a single 3D point sequence corrupted with noise and outliers. The experimental results show that our approach soundly outperforms state-of-the-art techniques in terms of both visual quality and accuracy.


Urban Dreams of Migrants: A Case Study of Migrant Integration in Shanghai

AAAI Conferences

Unprecedented human mobility has driven rapid urbanization around the world. In China, the fraction of the population dwelling in cities increased from 17.9% to 52.6% between 1978 and 2012. Such large-scale migration poses challenges for policymakers and important questions for researchers. To investigate the process of migrant integration, we employ a one-month complete dataset of telecommunication metadata in Shanghai with 54 million users and 698 million call logs. We find systematic differences between locals and migrants in their mobile communication networks and geographical locations. For instance, migrants have more diverse contacts and move around the city with a larger radius than locals after they settle down. By distinguishing new migrants (who recently moved to Shanghai) from settled migrants (who have been in Shanghai for a while), we demonstrate the integration process of new migrants in their first three weeks. Moreover, we formulate classification problems to predict whether a person is a migrant. Our classifier achieves an F1-score of 0.82 when distinguishing settled migrants from locals, but it remains challenging to identify new migrants because of class imbalance. This classification setup holds promise for identifying new migrants who will successfully integrate with locals (new migrants that are misclassified as locals).


FR-ANet: A Face Recognition Guided Facial Attribute Classification Network

AAAI Conferences

In this paper, we study the problem of facial attribute learning. In particular, we propose a Face Recognition guided facial Attribute classification Network, called FR-ANet. All the attributes share low-level features, while high-level features are specially learned for attribute groups. Further, to utilize the identity information, high-level features are merged to perform face identity recognition. The experimental results on CelebA and LFWA datasets demonstrate the promise of the FR-ANet.


Representation Learning for Scale-Free Networks

AAAI Conferences

Network embedding aims to learn low-dimensional representations of vertexes in a network while preserving the structure and inherent properties of the network. Existing network embedding works primarily focus on preserving the microscopic structure, such as the first- and second-order proximity of vertexes, while the macroscopic scale-free property is largely ignored. The scale-free property describes the fact that vertex degrees follow a heavy-tailed distribution (i.e., only a few vertexes have high degrees) and is a critical property of real-world networks, such as social networks. In this paper, we study the problem of learning representations for scale-free networks. We first theoretically analyze the difficulty of embedding and reconstructing a scale-free network in Euclidean space by converting our problem to the sphere packing problem. Then, we propose the "degree penalty" principle for designing network embedding algorithms that preserve the scale-free property: punishing the proximity between high-degree vertexes. We introduce two implementations of our principle, using spectral techniques and a skip-gram model respectively. Extensive experiments on six datasets show that our algorithms are able not only to reconstruct heavy-tailed degree distributions, but also to outperform state-of-the-art embedding models in various network mining tasks, such as vertex classification and link prediction.