Asia
Multi-Label Learning on Tensor Product Graph
Jiang, Jonathan (City University of Hong Kong)
A large family of graph-based semi-supervised algorithms have been developed intuitively and pragmatically for the multi-label learning problem. These methods, however, only implicitly exploited the label correlation, as either part of graph weight or an additional constraint, to improve overall classification performance. Despite their seemingly quite different formulations, we show that all existing approaches can be uniformly referred to as a Label Propagation (LP) or Random Walk with Restart (RWR) on a Cartesian Product Graph (CPG). Inspired by this discovery, we introduce a new framework for multi-label classification task, employing the Tensor Product Graph (TPG) โ the tensor product of the data graph with the class (label) graph โ in which not only the intra-class but also the inter-class associations are explicitly represented as weighted edges among graph vertices. In stead of computing directly on TPG, we derive an iterative algorithm, which is guaranteed to converge and with the same computational complexity and the same amount of storage as the standard label propagation on the original data graph. Applications to four benchmark multi-label data sets illustrate that our method outperforms several state-of-the-art approaches.
Multi-Label Learning by Exploiting Label Correlations Locally
Huang, Sheng-Jun (Nanjing University) | Zhou, Zhi-Hua (Nanjing University)
It is well known that exploiting label correlations is important for multi-label learning. Existing approaches typically exploit label correlations globally, by assuming that the label correlations are shared by all the instances. In real-world tasks, however, different instances may share different label correlations, and few correlations are globally applicable. In this paper, we propose the ML-LOC approach which allows label correlations to be exploited locally. To encode the local influence of label correlations, we derive a LOC code to enhance the feature representation of each instance. The global discrimination fitting and local correlation sensitivity are incorporated into a unified framework, and an alternating solution is developed for the optimization. Experimental results on a number of image, text and gene data sets validate the effectiveness of our approach.
Sparse Principal Component Analysis with Constraints
Grbovic, Mihajlo (Temple University) | Dance, Christopher Roger (Xerox Research Centre Europe) | Vucetic, Slobodan (Temple University)
The sparse principal component analysis is a variant of the classical principal component analysis, which finds linear combinations of a small number of features that maximize variance across data. In this paper we propose a methodology for adding two general types of feature grouping constraints into the original sparse PCA optimization procedure.We derive convex relaxations of the considered constraints, ensuring the convexity of the resulting optimization problem. Empirical evaluation on three real-world problems, one in process monitoring sensor networks and two in social networks, serves to illustrate the usefulness of the proposed methodology.
Efficient Multi-Stage Conjugate Gradient for Trust Region Step
Gong, Pinghua (Tsinghua University) | Zhang, Changshui (Tsinghua University)
The trust region step problem, by solving a sphere constrained quadratic programming, plays a critical role in the trust region Newton method. In this paper, we propose an efficient Multi-Stage Conjugate Gradient (MSCG) algorithm to compute the trust region step in a multi-stage manner. Specifically, when the iterative solution is in the interior of the sphere, we perform the conjugate gradient procedure. Otherwise, we perform a gradient descent procedure which points to the inner of the sphere and can make the next iterative solution be a interior point. Subsequently, we proceed with the conjugate gradient procedure again. We repeat the above procedures until convergence. We also present a theoretical analysis which shows that the MSCG algorithm converges. Moreover, the proposed MSCG algorithm can generate a solution in any prescribed precision controlled by a tolerance parameter which is the only parameter we need. Experimental results on large-scale text data sets demonstrate our proposed MSCG algorithm has a faster convergence speed compared with the state-of-the-art algorithms.
A Bayesian Approach to the Data Description Problem
Ghasemi, Alireza (Ecole Polytechnique Federale de Lausanne (EPFL)) | Rabiee, Hamid R. (Sharif University of Technology) | Manzuri, Mohammad Taghi (Sharif University of Technology) | Rohban, Mohammad Hossein (Sharif University of Technology)
In this paper, we address the problem of data description using a Bayesian framework. The goal of data description is to draw a boundary around objects of a certain class of interest to discriminate that class from the rest of the feature space. Data description is also known as one-class learning and has a wide range of applications. The proposed approach uses a Bayesian framework to precisely compute the class boundary and therefore can utilize domain information in form of prior knowledge in the framework. It can also operate in the kernel space and therefore recognize arbitrary boundary shapes. Moreover, the proposed method can utilize unlabeled data in order to improve accuracy of discrimination. We evaluate our method using various real-world datasets and compare it with other state of the art approaches of data description. Experiments show promising results and improved performance over other data description and one-class learning algorithms.
Clustering Documents Along Multiple Dimensions
Dasgupta, Sajib (IBM Almaden Research Center) | Golden, Richard M. (University of Texas at Dallas) | Ng, Vincent (University of Texas at Dallas)
Traditional clustering algorithms are designed to search for a single clustering solution despite the fact that multiple alternative solutions might exist for a particular dataset. For example, a set of news articles might be clustered by topic or by the author's gender or age. Similarly, book reviews might be clustered by sentiment or comprehensiveness. In this paper, we address the problem of identifying alternative clustering solutions by developing a Probabilistic Multi-Clustering (PMC) model that discovers multiple, maximally different clusterings of a data sample. Empirical results on six datasets representative of real-world applications show that our PMC model exhibits superior performance to comparable multi-clustering algorithms.
Weighted Clustering
Ackerman, Margareta (University of Waterloo) | Ben-David, Shai (University of Waterloo) | Brรขnzei, Simina (Aarhus University) | Loker, David (University of Waterloo)
We investigate a natural generalization of the classical clustering problem, considering clustering tasks in which different instances may have different weights. We conduct the first extensive theoretical analysis on the influence of weighted data on standard clustering algorithms in both the partitional and hierarchical settings, characterizing the conditions under which algorithms react to weights. Extending a recent framework for clustering algorithm selection, we propose intuitive properties that would allow users to choose between clustering algorithms in the weighted setting and classify algorithms accordingly.
Towards Population Scale Activity Recognition: A Framework for Handling Data Diversity
Abdullah, Saeed (Cornell University) | Lane, Nicholas D. (Microsoft Research Asia) | Choudhury, Tanzeem (Cornell University)
The rising popularity of the sensor-equipped smartphone is changing the possible scale and scope of human activity inference. The diversity in user population seen in large user bases can overwhelm conventional one-size-fits-all classi๏ฌcation approaches. Although personalized models are better able to handle population diversity, they often require increased effort from the end user during training and are computationally expensive. In this paper, we propose an activity classification framework that is scalable and can tractably handle an increasing number of users. Scalability is achieved by maintaining distinct groups of similar users during the training process, which makes it possible to account for the differences between users without resorting to training individualized classifiers. The proposed framework keeps user burden low by leveraging crowd-sourced data labels, where simple natural language processing techniques in combination with multi-instance learning are used to handle labeling errors introduced by low-commitment everyday users. Experiment results on a large public dataset demonstrate that the framework can cope with population diversity irrespective of population size.
A Well-Founded Semantics for Basic Logic Programs with Arbitrary Abstract Constraint Atoms
Wang, Yisong (Guizhou University) | Lin, Fangzhen (Hong Kong University of Science and Technology) | Zhang, Mingyi (Guizhou Academy of Sciences) | You, Jia-Huai (University of Alberta)
Logic programs with abstract constraint atoms proposed by Marek and Truszczynski are very general logic programs.They are general enough to captureaggregate logic programs as well asrecently proposed description logic programs.In this paper, we propose a well-founded semantics for basic logic programs with arbitrary abstract constraint atoms, which are sets of rules whose heads have exactly one atom. Weshow that similar to the well-founded semanticsof normal logic programs, it has many desirable properties such as that it can becomputed in polynomial time, and is always correct with respect to theanswer set semantics. This paves the way for using our well-founded semanticsto simplify these logic programs. We also show how our semantics can be applied toaggregate logic programs and description logic programs, and compare itto the well-founded semantics already proposed for these logic programs.
Exploring the Duality in Conflict-Directed Model-Based Diagnosis
Stern, Roni Tzvi (Ben Gurion University of the Negev) | Kalech, Meir (Ben Gurion University of the Negev) | Feldman, Alexander (University College Cork) | Provan, Gregory (University College Cork)
A model-based diagnosis problem occurs when an observation is inconsistent with the assumption that the diagnosed system is not faulty. The task of a diagnosis engine is to compute diagnoses, which are assumptions on the health of components in the diagnosed system that explain the observation. In this paper, we extend Reiter's well-known theory of diagnosis by exploiting the duality of the relation between conflicts and diagnoses. This duality means that a diagnosis is a hitting set of conflicts, but a conflict is also a hitting set of diagnoses. We use this property to interleave the search for diagnoses and conflicts: a set of conflicts can guide the search for diagnosis, and the computed diagnoses can guide the search for more conflicts. We provide the formal basis for this dual conflict-diagnosis relation, and propose a novel diagnosis algorithm that exploits this duality. Experimental results show that the new algorithm is able to find a minimal cardinality diagnosis faster than the well-known Conflict-Directed A*.