Hyperbolic Interaction Model For Hierarchical Multi-Label Classification

arXiv.org Machine Learning

Different from traditional classification tasks, which assume mutual exclusion of labels, hierarchical multi-label classification (HMLC) aims to assign multiple labels to every instance, with the labels organized under hierarchical relations. In fact, linguistic ontologies are intrinsically hierarchical. Besides the labels, the conceptual relations between words can also form hierarchical structures. Thus it can be a challenge to learn mappings from the word space to the label space, and vice versa. We propose to model the word and label hierarchies by embedding them jointly in hyperbolic space. The main reason is that the tree-likeness of hyperbolic space matches the complexity of symbolic data with hierarchical structures. A new hyperbolic interaction model (HyperIM) is designed to learn label-aware document representations and make predictions for HMLC. Extensive experiments are conducted on three benchmark datasets. The results demonstrate that the new model realistically captures the complex data structures and further improves performance for HMLC compared with state-of-the-art methods. To facilitate future research, our code is publicly available.
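For illustration only (not the authors' code), the sketch below computes the Poincare-ball geodesic distance that hyperbolic embeddings of this kind typically rely on, and uses negative word-label distances as interaction scores to build a label-aware document representation; the embeddings, shapes, and attention scheme are assumptions made up for the example.

import numpy as np

def poincare_distance(u, v, eps=1e-9):
    # Geodesic distance between points u, v strictly inside the unit Poincare ball.
    sq = np.sum((u - v) ** 2, axis=-1)
    denom = (1.0 - np.sum(u * u, axis=-1)) * (1.0 - np.sum(v * v, axis=-1))
    arg = 1.0 + 2.0 * sq / np.maximum(denom, eps)
    return np.arccosh(np.maximum(arg, 1.0))

# Toy word/label embeddings, sampled so their norms stay inside the unit ball.
rng = np.random.default_rng(0)
words = rng.uniform(-0.5, 0.5, size=(6, 2))   # 6 token embeddings
labels = rng.uniform(-0.5, 0.5, size=(4, 2))  # 4 label embeddings

# Word-label interaction scores: smaller hyperbolic distance -> stronger interaction.
scores = -poincare_distance(words[:, None, :], labels[None, :, :])

# Label-aware attention over tokens, one distribution per label.
attn = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)
label_aware_doc = attn.T @ words   # one aggregated representation per label
print(label_aware_doc.shape)       # (4, 2)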


Chained Path Evaluation for Hierarchical Multi-Label Classification

AAAI Conferences

In this paper we propose a novel hierarchical multi-label classification approach for tree and directed acyclic graph (DAG) hierarchies. The method predicts a single path (from the root to a leaf node) for tree hierarchies, and multiple paths for DAG hierarchies, by combining the predictions of every node in each possible path. In contrast with previous approaches, we evaluate all the paths, training local classifiers for each non-leaf node. The approach incorporates two contributions: (i) a cost is assigned to each node depending on its level in the hierarchy, giving more weight to correct predictions at the top levels; (ii) the relations between the nodes in the hierarchy are considered by incorporating the parent label, as in chained classifiers. The proposed approach was experimentally evaluated with 10 tree and 8 DAG hierarchical datasets in the domain of protein function prediction. It was contrasted with various state-of-the-art hierarchical classifiers using four common evaluation measures. The results show that our method is superior in almost all measures, and the difference is more significant in the case of DAG structures.
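As a rough illustration of the path-scoring idea (level-dependent weights on per-node predictions, combined along each root-to-leaf path), here is a toy sketch; the tree, the node probabilities, and the exact weighting formula are invented for the example and are not the paper's formulation.

import math

# Toy tree as {node: parent}; in the actual method these probabilities would come
# from local classifiers that also receive the parent's label as a feature.
parent = {"root": None, "A": "root", "B": "root", "A1": "A", "A2": "A", "B1": "B"}
leaves = ["A1", "A2", "B1"]
node_prob = {"A": 0.7, "B": 0.3, "A1": 0.6, "A2": 0.4, "B1": 0.9}  # placeholder scores

def depth(n):
    d = 0
    while parent[n] is not None:
        n, d = parent[n], d + 1
    return d

def level_weight(n, max_depth=2):
    # Give more weight to correct predictions near the top of the hierarchy.
    return 1.0 - (depth(n) - 1) / max_depth

def path_score(leaf):
    # Combine weighted log-probabilities of every node on the leaf-to-root path.
    score, n = 0.0, leaf
    while parent[n] is not None:
        score += level_weight(n) * math.log(node_prob[n])
        n = parent[n]
    return score

best = max(leaves, key=path_score)
print(best, path_score(best))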


Non-intrusive Load Monitoring via Multi-label Sparse Representation based Classification

arXiv.org Machine Learning

This work follows the approach of multi-label classification for non-intrusive load monitoring (NILM). We modify the popular sparse representation based classification (SRC) approach (developed for single-label classification) to solve multi-label classification problems. Results on the benchmark REDD and Pecan Street datasets show significant improvement over state-of-the-art techniques with a small volume of training data. In non-intrusive load monitoring (NILM) the technical goal is to estimate the power consumption of different appliances given the aggregate smart-meter readings [1]. The broader social objective is to feed this information back to the household so that occupants can reduce power consumption and thereby save energy.
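A loose sketch of how SRC can be bent toward multi-label prediction, along the lines the abstract suggests: code a test aggregate sparsely over a dictionary of per-appliance training signatures, then read labels off the per-class coefficient energy. The synthetic data, the Lasso-based coder, and the threshold are illustrative assumptions, not the authors' method.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n_features, per_class = 16, 5
appliances = ["fridge", "heater", "washer"]

# Synthetic dictionary: one block of columns per appliance, each a noisy copy of a
# base power signature (purely illustrative data).
base = rng.normal(size=(n_features, len(appliances)))
blocks = [base[:, [i]] + 0.05 * rng.normal(size=(n_features, per_class))
          for i in range(len(appliances))]
D = np.hstack(blocks)

# Aggregate reading = fridge + washer signatures, i.e. two labels are active.
y = blocks[0][:, 0] + blocks[2][:, 1]

# Sparse, non-negative code of the aggregate over the dictionary.
coder = Lasso(alpha=0.05, positive=True, max_iter=10000).fit(D, y)
coef = coder.coef_

# Per-appliance energy of the sparse code; thresholding gives the multi-label output.
energy = [np.linalg.norm(coef[i * per_class:(i + 1) * per_class])
          for i in range(len(appliances))]
predicted = [a for a, e in zip(appliances, energy) if e > 0.1 * max(energy)]
print(predicted)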


Multi-Label Graph Convolutional Network Representation Learning

arXiv.org Machine Learning

Knowledge representation of graph-based systems is fundamental across many disciplines. To date, most existing methods for representation learning primarily focus on networks with simplex labels, yet real-world objects (nodes) are inherently complex in nature and often carry rich semantics or multiple labels. In multi-label networks, nodes not only have multiple labels each, but these labels are often highly correlated, and existing methods are ineffective at handling, or fail to handle, such correlation for node representation learning. In this paper, we propose a novel multi-label graph convolutional network (ML-GCN) for learning node representations for multi-label networks. To fully explore label-label correlations and network topology structures, we propose to model a multi-label network as two Siamese GCNs: a node-node-label graph and a label-label-node graph. The two GCNs each handle one aspect of representation learning for nodes and labels, respectively, and they are seamlessly integrated under one objective function. The learned label representations can effectively preserve the inner-label interaction and node-label properties, and are then aggregated to enhance node representation learning under a unified training framework. Experiments and comparisons on multi-label node classification validate the effectiveness of our proposed approach. Graphs have become increasingly common structures for organizing data in many complex systems such as sensor networks, citation networks, social networks and many more [1]. This development has raised the need for efficient network representation (embedding) learning algorithms for various real-world applications, which seek to learn low-dimensional vector representations of all nodes while preserving graph topology structures such as edge links, degrees, and communities.
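For reference, a minimal sketch of the standard GCN propagation rule H' = relu(D^(-1/2) (A + I) D^(-1/2) H W), the building block such Siamese GCNs would share; the toy graph below mixes node and label vertices purely for illustration and is an assumption, not the ML-GCN architecture itself.

import numpy as np

def gcn_layer(A, H, W):
    # One propagation step over a symmetrically normalized adjacency with self-loops.
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(0)
# Toy node-node-label graph: 4 document nodes plus 2 label vertices wired to the
# nodes they tag (symmetric adjacency, illustrative only).
A = np.array([[0, 1, 0, 0, 1, 0],
              [1, 0, 1, 0, 1, 1],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1],
              [1, 1, 0, 0, 0, 0],
              [0, 1, 1, 1, 0, 0]], dtype=float)
H = rng.normal(size=(6, 8))        # initial node and label features
W = rng.normal(size=(8, 4))        # layer weights
print(gcn_layer(A, H, W).shape)    # (6, 4): joint node/label embeddings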


A no-regret generalization of hierarchical softmax to extreme multi-label classification

Neural Information Processing Systems

Extreme multi-label classification (XMLC) is the problem of tagging an instance with a small subset of relevant labels chosen from an extremely large pool of possible labels. Large label spaces can be handled efficiently by organizing labels as a tree, as in the hierarchical softmax (HSM) approach commonly used for multi-class problems. In this paper, we investigate probabilistic label trees (PLTs), which have recently been devised for tackling XMLC problems. We show that PLTs are a no-regret multi-label generalization of HSM when precision@k is used as the model evaluation metric. Critically, we prove that the pick-one-label heuristic, a reduction technique from multi-label to multi-class that is routinely used along with HSM, is not consistent in general. We also show that our implementation of PLTs, referred to as extremeText (XT), obtains significantly better results than HSM with the pick-one-label heuristic and XML-CNN, a deep network specifically designed for XMLC problems. Moreover, XT is competitive with many state-of-the-art approaches in terms of statistical performance, model size and prediction time, which makes it amenable to deployment in an online system.
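A minimal sketch of top-k inference in a probabilistic label tree, assuming each node's conditional probability has already been produced by a node-level classifier (the tree and numbers below are placeholders): a label's marginal probability is the product of conditional probabilities along its root-to-leaf path, and best-first search returns the k most probable labels exactly because path products never increase with depth.

import heapq

# Each node stores P(subtree contains a relevant label | parent does); leaves are labels.
tree = {
    "root":   {"p": 1.0, "children": ["n0", "n1"]},
    "n0":     {"p": 0.9, "children": ["cats", "dogs"]},
    "n1":     {"p": 0.4, "children": ["cars", "planes"]},
    "cats":   {"p": 0.8, "children": []},
    "dogs":   {"p": 0.3, "children": []},
    "cars":   {"p": 0.7, "children": []},
    "planes": {"p": 0.2, "children": []},
}

def top_k_labels(k=2):
    # Best-first search: always expand the partial path with the highest product so far.
    heap, results = [(-1.0, "root")], []
    while heap and len(results) < k:
        neg_p, node = heapq.heappop(heap)
        children = tree[node]["children"]
        if not children:                      # leaf reached, i.e. a label
            results.append((node, -neg_p))
            continue
        for c in children:
            heapq.heappush(heap, (neg_p * tree[c]["p"], c))
    return results

print(top_k_labels(2))   # [('cats', 0.72), ('cars', 0.28)] for the placeholder numbers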