Bayesian models for Large-scale Hierarchical Classification

Neural Information Processing Systems

A challenging problem in hierarchical classification is to leverage the hierarchical relations among classes for improving classification performance. An even greater challenge is to do so in a manner that is computationally feasible for the large-scale problems usually encountered in practice. This paper proposes a set of Bayesian methods to model hierarchical dependencies among class labels using multivariate logistic regression. Specifically, the parent-child relationships are modeled by placing a hierarchical prior over the children nodes centered around the parameters of their parents, thereby encouraging classes nearby in the hierarchy to share similar model parameters. We present new, efficient variational algorithms for tractable posterior inference in these models, and provide a parallel implementation that can comfortably handle large-scale problems with hundreds of thousands of dimensions and tens of thousands of classes.
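
One way to read the hierarchical prior described above is sketched below. The Gaussian form, the covariance symbols, and the softmax likelihood over leaf classes are assumptions made for illustration; the abstract states only that each child's prior is centered on its parent's parameters.

```latex
% Illustrative sketch only. Assumed notation (not from the abstract):
%   w_n      parameter vector of node n
%   \pi(n)   parent of node n
%   \Sigma_n covariance attached to node n
%   \mathcal{L} set of leaf classes
\begin{align*}
  w_{\mathrm{root}} &\sim \mathcal{N}\!\left(0,\ \Sigma_{\mathrm{root}}\right), \\
  w_n \mid w_{\pi(n)} &\sim \mathcal{N}\!\left(w_{\pi(n)},\ \Sigma_{\pi(n)}\right)
      \quad \text{for every non-root node } n, \\
  P(y = n \mid x) &= \frac{\exp\!\left(w_n^{\top} x\right)}
      {\sum_{n' \in \mathcal{L}} \exp\!\left(w_{n'}^{\top} x\right)}
      \quad \text{for } n \in \mathcal{L}.
\end{align*}
```

Under this reading, a small covariance at a parent pulls its children's parameters close to its own, which is how nearby classes would come to share similar parameters.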

A New Search Engine Integrating Hierarchical Browsing and Keyword Search

AAAI Conferences

The original Yahoo! search engine consisted of a manually organized topic hierarchy of webpages for easy browsing. Modern search engines (such as Google and Bing), on the other hand, return a flat list of webpages based on keywords. It would be ideal if hierarchical browsing and keyword search could be seamlessly combined. The main difficulty in doing so is to automatically (i.e., not manually) classify and rank a massive number of webpages into various hierarchies (such as topics, media types, and regions of the world). In this paper we report our attempt towards building this integrated search engine, called SEE (Search Engine with hiErarchy). We implement a hierarchical classification system based on Support Vector Machines, and embed it in SEE. We also design a novel user interface that allows users to dynamically adjust their preference for higher accuracy vs. more results in any (sub)category of the hierarchy. Though our current search engine is still small (indexing about 1.2 million webpages), the results, including a small user study, show great promise for integrating such techniques in next-generation search engines.

A Hierarchical Model for Morphological Galaxy Classification

AAAI Conferences

We propose a new method for morphological galaxy classification which incorporates two main contributions: (i) the generation of artificial images of galaxies through geometric transformations to be used as additional examples in the training phase, and (ii) the use of a novel hierarchical classifier for hierarchical galaxy classification. An additional classifier distinguishes galaxies from stars based on geometrical moments. The proposed method was tested with two different astronomical databases. The results show that the hierarchical classification method has a higher performance than flat classification, and that the use of artificial examples and oversampling provides a significant improvement in performance.

Chained Path Evaluation for Hierarchical Multi-Label Classification

AAAI Conferences

In this paper we propose a novel hierarchical multi-label classification approach for tree and directed acyclic graph (DAG) hierarchies. The method predicts a single path (from the root to a leaf node) for tree hierarchies, and multiple paths for DAG hierarchies, by combining the predictions of every node in each possible path. In contrast with previous approaches, we evaluate all the paths, training local classifiers for each non-leaf node. The approach incorporates two contributions: (i) a cost is assigned to each node depending on the level it has in the hierarchy, giving more weight to correct predictions at the top levels; (ii) the relations between the nodes in the hierarchy are considered, by incorporating the parent label as in chained classifiers. The proposed approach was experimentally evaluated with 10 tree and 8 DAG hierarchical datasets in the domain of protein function prediction. It was contrasted with various state-of-the-art hierarchical classifiers using four common evaluation measures. The results show that our method is superior in almost all measures, and this difference is more significant in the case of DAG structures.
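
The idea of scoring every root-to-leaf path with level-dependent weights can be sketched roughly as follows for a tree hierarchy. This is a toy illustration, not the paper's method: the tree, the per-node probabilities, the linear weight formula in `level_weight`, and the weighted log-probability score are all hypothetical choices, since the abstract does not give the exact cost or combination rule.

```python
import math

# Hypothetical toy tree (parent -> children); not taken from the paper.
TREE = {"root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1"]}

# Hypothetical outputs of the local classifiers: probability of each node.
PROBS = {"A": 0.6, "B": 0.4, "A1": 0.3, "A2": 0.7, "B1": 0.9}


def enumerate_paths(tree, node="root", prefix=()):
    """Yield every root-to-leaf path as a tuple of node names (root excluded)."""
    children = tree.get(node, [])
    if not children:
        yield prefix
        return
    for child in children:
        yield from enumerate_paths(tree, child, prefix + (child,))


def level_weight(depth, max_depth):
    """Assumed weighting: decreases linearly with depth, so top levels count more."""
    return 1.0 - (depth - 1) / (max_depth + 1)


def score_path(path, probs, max_depth):
    """Assumed combination rule: weighted sum of log-probabilities along the path."""
    return sum(
        level_weight(depth, max_depth) * math.log(probs[node])
        for depth, node in enumerate(path, start=1)
    )


def best_path(tree, probs):
    """Evaluate all paths and return the highest-scoring one."""
    paths = list(enumerate_paths(tree))
    max_depth = max(len(p) for p in paths)
    return max(paths, key=lambda p: score_path(p, probs, max_depth))


if __name__ == "__main__":
    print(best_path(TREE, PROBS))
```

For a DAG hierarchy, the same scoring could in principle be applied to each candidate path and several paths kept instead of one; how the paper selects multiple paths is not stated in the abstract.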

A Hybrid Global-Local Approach for Hierarchical Classification

AAAI Conferences

Hierarchical classification is a variant of multidimensional classification where the classes are arranged in a hierarchy and the objective is to predict a class, or set of classes, according to a taxonomy. Different alternatives have been proposed for hierarchical classification, including local and global approaches. Local approaches are prone to the inconsistency problem, while global approaches tend to produce more complex models. In this paper, we propose a hybrid global-local approach inspired by multidimensional classification. It starts by building a local multi-class classifier for each parent node in the hierarchy. In the classification phase all the local classifiers are applied simultaneously to each instance, resulting in a most probable class for each classifier. A set of consistent classes is obtained, according to the hierarchy, based on three novel alternatives. The proposed method was tested on three different hierarchical classification data sets and was compared against state-of-the-art methods, resulting in significantly superior performance to the traditional top-down techniques, with competitive results against more complex top-down classifier selection methods.