Goto

Collaborating Authors

 Statistical Learning


Integrating Clustering and Multi-Document Summarization by Bi-Mixture Probabilistic Latent Semantic Analysis (PLSA) with Sentence Bases

AAAI Conferences

Probabilistic Latent Semantic Analysis (PLSA) has been popularly used in document analysis. However, as it is currently formulated, PLSA strictly requires the number of word latent classes to be equal to the number of document latent classes. In this paper, we propose Bi-mixture PLSA, a new formulation of PLSA that allows the number of latent word classes to be different from the number of latent document classes. We further extend Bi-mixture PLSA to incorporate the sentence information, and propose Bi-mixture PLSA with sentence bases (Bi-PLSAS) to simultaneously cluster and summarize the documents utilizing the mutual influence of the document clustering and summarization procedures. Experiments on real-world datasets demonstrate the effectiveness of our proposed methods.


Exploiting Phase Transition in Latent Networks for Clustering

AAAI Conferences

In this paper, we model the pair-wise similarities of a setof documents as a weighted network with a single cutoffparameter. Such a network can be thought of an ensemble of unweighted graphs, each consisting of edges withweights greater than the cutoff value. We look at this network ensemble as a complex system with a temperature parameter, and refer to it as a Latent Network. Ourexperiments on a number of datasets from two different domains show that certain properties of latent networks like clustering coefficient, average shortest path,and connected components exhibit patterns that are significantly divergent from randomized networks. We explain that these patterns reflect the network phase transition as well as the existence of a community structure in document collections. Using numerical analysis,we show that we can use the aforementioned networkproperties to predicts the clustering Normalized MutualInformation (NMI) with high correlation (rho > 0.9). Finally we show that our clustering method significantlyoutperforms other baseline methods (NMI > 0.5)


Partially Supervised Text Classification with Multi-Level Examples

AAAI Conferences

Partially supervised text classification has received great research attention since it only uses positive and unlabeled examples as training data. This problem can be solved by automatically labeling some negative (and more positive) examples from unlabeled examples before training a text classifier. But it is difficult to guarantee both high quality and quantity of the new labeled examples. In this paper, a multi-level example based learning method for partially supervised text classification is proposed, which can make full use of all unlabeled examples. A heuristic method is proposed to assign possible labels to unlabeled examples and partition them into multiple levels according to their labeling confidence. A text classifier is trained on these multi-level examples using weighted support vector machines. Experiments show that the multi-level example based learning method is effective for partially supervised text classification, and outperforms the existing popular methods such as Biased-SVM, ROC-SVM, S-EM and WL.


Social Relations Model for Collaborative Filtering

AAAI Conferences

We propose a novel probabilistic model for collaborative filtering (CF), called SRMCoFi, which seamlessly integrates both linear and bilinear random effects into a principled framework. The formulation of SRMCoFi is supported by both social psychological experiments and statistical theories. Not only can many existing CF methods be seen as special cases of SRMCoFi, but it also integrates their advantages while simultaneously overcoming their disadvantages. The solid theoretical foundation of SRMCoFi is further supported by promising empirical results obtained in extensive experiments using real CF data sets on movie ratings.


Multi-Task Learning in Heterogeneous Feature Spaces

AAAI Conferences

Multi-task learning aims at improving the generalization performance of a learning task with the help of some other related tasks. Although many multi-task learning methods have been proposed, they are all based on the assumption that all tasks share the same data representation. This assumption is too restrictive for general applications. In this paper, we propose a multi-task extension of linear discriminant analysis (LDA), called multi-task discriminant analysis (MTDA), which can deal with learning tasks with different data representations. For each task, MTDA learns a separate transformation which consists of two parts, one specific to the task and one common to all tasks. A by-product of MTDA is that it can alleviate the labeled data deficiency problem of LDA. Moreover, unlike many existing multi-task learning methods, MTDA can handle binary and multi-class problems for each task in a generic way. Experimental results on face recognition show that MTDA consistently outperforms related methods.


Convex Sparse Coding, Subspace Learning, and Semi-Supervised Extensions

AAAI Conferences

Automated feature discovery is a fundamental problem in machine learning. Although classical feature discovery methods do not guarantee optimal solutions in general, it has been recently noted that certain subspace learning and sparse coding problems can be solved efficiently, provided the number of features is not restricted a priori. We provide an extended characterization of this optimality result and describe the nature of the solutions under an expanded set of practical contexts. In particular, we apply the framework to a semi-supervised learning problem, and demonstrate that feature discovery can co-occur with input reconstruction and supervised training while still admitting globally optimal solutions. A comparison to existing semi-supervised feature discovery methods shows improved generalization and efficiency.


Transfer Latent Semantic Learning: Microblog Mining with Less Supervision

AAAI Conferences

The increasing volume of information generated on micro-blogging sites such as Twitter raises several challenges to traditional text mining techniques. First, most texts from those sites are abbreviated due to the constraints of limited characters in one post; second, the input usually comes in streams of large-volumes. Therefore, it is of significant importance to develop effective and efficient representations of abbreviated texts for better filtering and mining. In this paper, we introduce a novel transfer learning approach, namely transfer latent semantic learning, that utilizes a large number of related tagged documents with rich information from other sources (source domain) to help build a robust latent semantic space for the abbreviated texts (target domain). This is achieved by simultaneously minimizing the document reconstruction error and the classification error of the labeled examples from the source domain by building a classifier with hinge loss in the latent semantic space. We demonstrate the effectiveness of our method by applying them to the task of classifying and tagging abbreviated texts. Experimental results on both synthetic datasets and real application datasets, including Reuters-21578 and Twitter data, suggest substantial improvements using our approach over existing ones.


Nonnegative Spectral Clustering with Discriminative Regularization

AAAI Conferences

Clustering is a fundamental research topic in the field of data mining. Optimizing the objective functions of clustering algorithms, e.g. normalized cut and k-means, is an NP-hard optimization problem. Existing algorithms usually relax the elements of cluster indicator matrix from discrete values to continuous ones. Eigenvalue decomposition is then performed to obtain a relaxed continuous solution, which must be discretized. The main problem is that the signs of the relaxed continuous solution are mixed. Such results may deviate severely from the true solution, making it a nontrivial task to get the cluster labels. To address the problem, we impose an explicit nonnegative constraint for a more accurate solution during the relaxation. Besides, we additionally introduce a discriminative regularization into the objective to avoid overfitting. A new iterative approach is proposed to optimize the objective. We show that the algorithm is a general one which naturally leads to other extensions. Experiments demonstrate the effectiveness of our algorithm.


Direct Density-Ratio Estimation with Dimensionality Reduction via Hetero-Distributional Subspace Analysis

AAAI Conferences

Methods for estimating the ratio of two probability density functions have been actively explored recently since they can be used for various data processing tasks such as non-stationarity adaptation, outlier detection, feature selection, and conditional probability estimation. In this paper, we propose a new density-ratio estimator which incorporates dimensionality reduction into the density-ratio estimation procedure. Through experiments, the proposed method is shown to compare favorably with existing density-ratio estimators in terms of both accuracy and computational costs.


Multi-Task Learning in Square Integrable Space

AAAI Conferences

Several kernel based methods for multi-task learning have been proposed, which leverage relations among tasks as regularization to enhance the overall learning accuracies. These methods assume that the tasks share the same kernel, which could limit their applications because in practice different tasks may need different kernels. The main challenge of introducing multiple kernels into multiple tasks is that models from different Reproducing Kernel Hilbert Spaces (RKHSs) are not comparable, making it difficult to exploit relations among tasks. This paper addresses the challenge by formalizing the problem in the Square Integrable Space (SIS). Specially, it proposes a kernel based method which makes use of a regularization term defined in the SIS to represent task relations. We prove a new representer theorem for the proposed approach in SIS. We further derive a practical method for solving the learning problem and conduct consistency analysis of the method. We discuss the relations between our method and an existing method. We also give an SVM based implementation of our method for multi-label classification. Experiments on two real-world data sets show that the proposed method performs better than the existing method.