Hyperbolic Interaction Model For Hierarchical Multi-Label Classification

arXiv.org Machine Learning

Different from the traditional classification tasks which assume mutual exclusion of labels, hierarchical multi-label classification (HMLC) aims to assign multiple labels to every instance with the labels organized under hierarchical relations. In fact, linguistic ontologies are intrinsic hierarchies. Besides the labels, the conceptual relations between words can also form hierarchical structures. Thus it can be a challenge to learn mappings from the word space to the label space, and vice versa. We propose to model the word and label hierarchies by embedding them jointly in the hyperbolic space. The main reason is that the tree-likeness of the hyperbolic space matches the complexity of symbolic data with hierarchical structures. A new hyperbolic interaction model (HyperIM) is designed to learn the label-aware document representations and make predictions for HMLC. Extensive experiments are conducted on three benchmark datasets. The results have demonstrated that the new model can realistically capture the complex data structures and further improve the performance for HMLC comparing with the state-of-the-art methods. To facilitate future research, our code is publicly available.

Hierarchical Taxonomy-Aware and Attentional Graph Capsule RCNNs for Large-Scale Multi-Label Text Classification

arXiv.org Machine Learning

CNNs, RNNs, GCNs, and CapsNets have shown significant insights in representation learning and are widely used in various text mining tasks such as large-scale multi-label text classification. However, most existing deep models for multi-label text classification consider either the non-consecutive and long-distance semantics or the sequential semantics, but how to consider them both coherently is less studied. In addition, most existing methods treat output labels as independent methods, but ignore the hierarchical relations among them, leading to useful semantic information loss. In this paper, we propose a novel hierarchical taxonomy-aware and attentional graph capsule recurrent CNNs framework for large-scale multi-label text classification. Specifically, we first propose to model each document as a word order preserved graph-of-words and normalize it as a corresponding words-matrix representation which preserves both the non-consecutive, long-distance and local sequential semantics. Then the words-matrix is input to the proposed attentional graph capsule recurrent CNNs for more effectively learning the semantic features. To leverage the hierarchical relations among the class labels, we propose a hierarchical taxonomy embedding method to learn their representations, and define a novel weighted margin loss by incorporating the label representation similarity. Extensive evaluations on three datasets show that our model significantly improves the performance of large-scale multi-label text classification by comparing with state-of-the-art approaches.

Mandatory Leaf Node Prediction in Hierarchical Multilabel Classification

Neural Information Processing Systems

In hierarchical classification, the prediction paths may be required to always end at leaf nodes. This is called mandatory leaf node prediction (MLNP) and is particularly useful when the leaf nodes have much stronger semantic meaning than the internal nodes. However, while there have been a lot of MLNP methods in hierarchical multiclass classification, performing MLNP in hierarchical multilabel classification is much more difficult. In this paper, we propose a novel MLNP algorithm that (i) considers the global hierarchy structure; and (ii) can be used on hierarchies of both trees and DAGs. We show that one can efficiently maximize the joint posterior probability of all the node labels by a simple greedy algorithm. Moreover, this can be further extended to the minimization of the expected symmetric loss. Experiments are performed on a number of real-world data sets with tree- and DAG-structured label hierarchies. The proposed method consistently outperforms other hierarchical and flat multilabel classification methods.

Chained Path Evaluation for Hierarchical Multi-Label Classification

AAAI Conferences

In this paper we propose a novel hierarchical multi-label clas- sification approach for tree and directed acyclic graph (DAG) hierarchies. The method predicts a single path (from the root to a leaf node) for tree hierarchies, and multiple paths for DAG hierarchies, by combining the predictions of every node in each possible path. In contrast with previous approaches, we evaluate all the paths, training local classifiers for each non-leaf node. The approach incorporates two contributions; (i) a cost is assigned to each node depending on the level it has in the hierarchy, giving more weight to correct predic- tions at the top levels; (ii) the relations between the nodes in the hierarchy are considered, by incorporating the parent label as in chained classifiers. The proposed approach was experimentally evaluated with 10 tree and 8 DAG hierarchi- cal datasets in the domain of protein function prediction. It was contrasted with various state-of-the-art hierarchical clas- sifiers using four common evaluation measures. The results show that our method is superior in almost all measures, and this difference is more significant in the case of DAG struc- tures.

Multi-Label Graph Convolutional Network Representation Learning

arXiv.org Machine Learning

--Knowledge representation of graph-based systems is fundamental across many disciplines. T o date, most existing methods for representation learning primarily focus on networks with simplex labels, yet real-world objects (nodes) are inherently complex in nature and often contain rich semantics or labels, e . The multi-label network nodes not only have multiple labels for each node, such labels are often highly correlated making existing methods ineffective or fail to handle such correlation for node representation learning. In this paper, we propose a novel multi-label graph convolutional network (ML-GCN) for learning node representation for multi-label networks. T o fully explore label-label correlation and network topology structures, we propose to model a multi-label network as two Siamese GCNs: a node-node-label graph and a label-label-node graph. The two GCNs each handle one aspect of representation learning for nodes and labels, respectively, and they are seamlessly integrated under one objective function. The learned label representations can effectively preserve the inner-label interaction and node label properties, and are then aggregated to enhance the node representation learning under a unified training framework. Experiments and comparisons on multi-label node classification validate the effectiveness of our proposed approach. Graphs have become increasingly common structures for organizing data in many complex systems such as sensor networks, citation networks, social networks and many more [1]. Such a development raised new requirement of efficient network representation or embedding learning algorithms for various real-world applications, which seeks to learn low-dimensional vector representations of all nodes with preserved graph topology structures, such as edge links, degrees, and communities etc.