MUVIR: Multi-View Rare Category Detection

AAAI Conferences

Rare category detection refers to the problem of identifying the initial examples from underrepresented minority classes in an imbalanced dataset. This problem becomes more challenging in many real applications, such as synthetic ID detection and insider threat detection, where the data comes from multiple views and some views may be irrelevant for distinguishing between majority and minority classes. Existing techniques for rare category detection are not well suited to such applications, as they mainly focus on data with a single view. To address the problem of multi-view rare category detection, we propose a novel framework named MUVIR. It builds upon existing techniques for rare category detection on each single view, and exploits the relationship among multiple views to estimate the overall probability of each example belonging to the minority class. In particular, we study multiple special cases of the framework with respect to their working conditions, and analyze the performance of MUVIR in the presence of irrelevant views. For problems where the exact priors of the minority classes are unknown, we generalize the MUVIR algorithm to work with only an upper bound on the priors. Experimental results on both synthetic and real data sets demonstrate the effectiveness of the proposed framework, especially in the presence of irrelevant views.


Mobile Query Recommendation via Tensor Function Learning

AAAI Conferences

With the prevalence of mobile search, the benefits of mobile query recommendation, which provides formulated queries that match users’ search intent, are well recognized. In this paper, we introduce the problem of query recommendation on mobile devices and model the user-location-query relations with a tensor representation. Unlike previous studies based on tensor decomposition, we study this problem via tensor function learning. That is, we learn the tensor function from the side information of users, locations and queries, and then predict users’ search intent. We develop an efficient alternating direction method of multipliers (ADMM) scheme to solve the introduced problem. We empirically evaluate our approach on a mobile query dataset from the Bing search engine in the city of Beijing, China, and show that our method can outperform several state-of-the-art approaches.


Multi-Task Multi-View Clustering for Non-Negative Data

AAAI Conferences

Multi-task clustering and multi-view clustering have each found wide applications and received much attention in recent years. Nevertheless, there are many clustering problems that involve both multi-task clustering and multi-view clustering, i.e., the tasks are closely related and each task can be analyzed from multiple views. In this paper, for non-negative data (e.g., documents), we introduce a multi-task multi-view clustering (MTMVC) framework which integrates within-view-task clustering, multi-view relationship learning and multi-task relationship learning. We then propose a specific algorithm to optimize the MTMVC framework. Experimental results show the superiority of the proposed algorithm over both multi-task clustering algorithms and multi-view clustering algorithms for multi-task clustering of multi-view data.


Solving the Partial Label Learning Problem: An Instance-Based Approach

AAAI Conferences

In partial label learning, each training example is associated with a set of candidate labels, among which only one is valid. An intuitive strategy to learn from partial label examples is to treat all candidate labels equally and make predictions by averaging their modeling outputs. Nonetheless, this strategy may suffer from the problem that the modeling output from the valid label is overwhelmed by those from the false positive labels. In this paper, an instance-based approach named IPAL is proposed, which directly disambiguates the candidate label set. Briefly, IPAL tries to identify the valid label of each partial label example via an iterative label propagation procedure, and then classifies an unseen instance based on minimum error reconstruction from its nearest neighbors. Extensive experiments show that IPAL compares favorably against the existing instance-based as well as other state-of-the-art partial label learning approaches.
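
The "treat all candidate labels equally" baseline mentioned in this abstract can be illustrated with a small nearest-neighbor voting sketch. This is not IPAL; it is only a hypothetical illustration of the averaging strategy, and the function name `knn_candidate_vote`, the toy data and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_candidate_vote(train_x, train_candidates, query, k=3):
    """Predict a label by letting the k nearest training examples vote
    with every label in their candidate sets, each counted equally."""
    # Rank training examples by Euclidean distance to the query point.
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter()
    for i in order[:k]:
        for label in train_candidates[i]:
            votes[label] += 1  # no disambiguation: every candidate counts
    # Iterate labels in sorted order so ties are broken deterministically.
    return max(sorted(votes), key=lambda label: votes[label])


# Toy data (made up): three nearby examples share the candidate "cat".
train_x = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
train_candidates = [{"cat", "dog"}, {"cat", "bird"}, {"cat"}, {"fish"}]
prediction = knn_candidate_vote(train_x, train_candidates, (0.05, 0.05), k=3)
print(prediction)
```

Because every candidate label receives a full vote, neighbors with large candidate sets can push a false positive label ahead of the valid one, which is the weakness the abstract describes.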


Towards Class-Imbalance Aware Multi-Label Learning

AAAI Conferences

In multi-label learning, each object is represented by a single instance while associated with a set of class labels. Due to the huge (exponential) number of possible label sets for prediction, existing approaches mainly focus on how to exploit label correlations to facilitate the learning process. Nevertheless, an intrinsic characteristic of learning from multi-label data, i.e., the widespread class imbalance among labels, has not been well investigated. Generally, the number of positive training instances for each class label is far smaller than the number of negative ones, which may lead to performance degradation for most multi-label learning techniques. In this paper, a new multi-label learning approach named Cross-Coupling Aggregation (COCOA) is proposed, which aims at leveraging the exploitation of label correlations as well as the exploration of class imbalance. Briefly, to induce the predictive model on each class label, one binary-class imbalance learner corresponding to the current label and several multi-class imbalance learners coupling with other labels are aggregated for prediction. Extensive experiments clearly validate the effectiveness of the proposed approach, especially in terms of imbalance-specific evaluation metrics such as F-measure and area under the ROC curve.
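
To see why imbalance-specific metrics such as F-measure matter for a single class label, the following minimal sketch compares plain accuracy with F-measure on a made-up label that has 5 positives out of 100 instances. The data and the helper name `f_measure` are assumptions for illustration and are not taken from the paper.

```python
def f_measure(y_true, y_pred):
    """F-measure (harmonic mean of precision and recall) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical label with 5 positive and 95 negative instances.
y_true = [1] * 5 + [0] * 95

# A trivial predictor that always answers "negative".
all_negative = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, all_negative)) / len(y_true)
f_all_negative = f_measure(y_true, all_negative)

# A predictor that finds 4 of the 5 positives and raises 2 false alarms.
some_positive = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93
f_some_positive = f_measure(y_true, some_positive)

print(accuracy, f_all_negative, f_some_positive)
```

The always-negative predictor looks strong on accuracy yet scores zero on F-measure, while the second predictor is rewarded for actually recovering positives.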


Instance-Wise Weighted Nonnegative Matrix Factorization for Aggregating Partitions with Locally Reliable Clusters

AAAI Conferences

We address an ensemble clustering problem, where reliable clusters are locally embedded in given multiple partitions. We propose a new nonnegative matrix factorization (NMF)-based method, in which locally reliable clusters are explicitly considered by using instance-wise weights over clusters. Our method factorizes the input cluster assignment matrix into two matrices H and W, which are optimized by alternating between 1) updating H and W while keeping the weight matrix constant and 2) updating the weight matrix while keeping H and W constant. The weights in the second step are updated by solving a convex problem, which makes our algorithm significantly faster than existing NMF-based ensemble clustering methods. We show empirically, on a variety of datasets, that our method outperforms many cutting-edge ensemble clustering methods.


Matrix Factorization with Scale-Invariant Parameters

AAAI Conferences

Tuning hyper-parameters for large-scale matrix factorization (MF) is very time consuming and sometimes unacceptable. Intuitively, we want to tune hyper-parameters on a small sub-matrix sample and then apply them to the original large-scale matrix. However, most existing MF methods are scale-variant, which means the optimal hyper-parameters usually change with the scale of the matrix. To this end, in this paper we propose a scale-invariant parametric MF method, where a set of scale-invariant parameters is defined for model complexity regularization. Therefore, the proposed method can free us from tuning hyper-parameters on the large-scale matrix, and achieve good performance in a more efficient way. Extensive experiments on a real-world dataset clearly validate both the effectiveness and efficiency of our method.


Accelerated Inexact Soft-Impute for Fast Large-Scale Matrix Completion

AAAI Conferences

Matrix factorization tries to recover a low-rank matrix from limited observations. A state-of-the-art algorithm is Soft-Impute, which exploits a special “sparse plus low-rank” structure of the matrix iterates to allow efficient SVD in each iteration. Though Soft-Impute is also a proximal gradient algorithm, it is generally believed that acceleration techniques are not useful and will destroy the special structure. In this paper, we show that Soft-Impute can indeed be accelerated without compromising the “sparse plus low-rank” structure. To further reduce the per-iteration time complexity, we propose an approximate singular value thresholding scheme based on the power method. Theoretical analysis shows that the proposed algorithm enjoys the fast O(1/T^2) convergence rate of accelerated proximal gradient algorithms. Extensive experiments on both synthetic and large recommendation data sets show that the proposed algorithm is much faster than Soft-Impute and other state-of-the-art matrix completion algorithms.
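
As a rough illustration of the two ingredients named in this abstract, the sketch below uses plain power iteration to estimate the largest singular value of a small matrix and then soft-thresholds it. It is a generic textbook sketch, not the paper's approximate thresholding scheme; the function names, the uniform starting vector, the fixed iteration count and the threshold value are all assumptions.

```python
import math


def top_singular_triplet(A, iters=200):
    """Estimate the largest singular value of A and its singular vectors
    by power iteration on A^T A. A is a list of equal-length rows."""
    m, n = len(A), len(A[0])
    # Assumption: the uniform start vector is not orthogonal to the
    # leading right singular vector.
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]  # A v
        w = [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]  # A^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, [0.0] * m, v
        v = [x / norm for x in w]
    u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma == 0.0:
        return 0.0, [0.0] * m, v
    return sigma, [x / sigma for x in u], v


def soft_threshold(sigma, lam):
    """Shrink a singular value toward zero by lam, clipping at zero."""
    return max(sigma - lam, 0.0)


A = [[3.0, 0.0], [0.0, 1.0]]  # toy matrix with singular values 3 and 1
sigma, u, v = top_singular_triplet(A)
shrunk = soft_threshold(sigma, lam=1.0)
print(sigma, shrunk)
```

Full singular value thresholding applies the same shrinkage to every singular value and discards those that reach zero, which is why only the leading few singular vectors are needed when the threshold is large.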


Deep Convolutional Neural Networks on Multichannel Time Series for Human Activity Recognition

AAAI Conferences

This paper focuses on the human activity recognition (HAR) problem, in which inputs are multichannel time series signals acquired from a set of body-worn inertial sensors and outputs are predefined human activities. In this problem, extracting effective features for identifying activities is a critical but challenging task. Most existing work relies on heuristic hand-crafted feature design and shallow feature learning architectures, which cannot find the distinguishing features needed to accurately classify different activities. In this paper, we propose a systematic feature learning method for the HAR problem. This method adopts a deep convolutional neural network (CNN) to automate feature learning from the raw inputs in a systematic way. Through the deep architecture, the learned features serve as a higher-level abstract representation of the low-level raw time series signals. By leveraging the labelled information via supervised learning, the learned features are endowed with more discriminative power. Unified in one model, feature learning and classification are mutually enhanced. All these advantages of the CNN make it outperform other HAR algorithms, as verified in the experiments on the Opportunity Activity Recognition Challenge and other benchmark datasets.


Ice-Breaking: Mitigating Cold-Start Recommendation Problem by Rating Comparison

AAAI Conferences

Recommender systems have become an indispensable component in many e-commerce sites. One major challenge that largely remains open is the cold-start problem, which can be viewed as an ice barrier that keeps the cold-start users/items from the warm ones. In this paper, we propose a novel rating comparison strategy (RaPare) to break this ice barrier. The centerpiece of RaPare is to provide a fine-grained calibration on the latent profiles of cold-start users/items by exploring the differences between cold-start and warm users/items. We instantiate our RaPare strategy on the prevalent method in recommender systems, i.e., matrix factorization based collaborative filtering. Experimental evaluations on two real data sets validate the superiority of our approach over the existing methods in cold-start scenarios.