Multi-label classification is an important learning problem with many applications. In this work, we propose a principled similarity-based approach for multi-label learning called SML. We also introduce a similarity-based approach for predicting the label set size. The experimental results demonstrate the effectiveness of SML for multi-label classification, where it compares favorably with a wide variety of existing algorithms across a range of evaluation criteria.
In recent years, the multi-label classification problem has become a controversial issue. In this kind of classification, each sample is associated with a set of class labels. Ensemble approaches are supervised learning algorithms in which an operator takes a number of learning algorithms, called base-level algorithms, and combines their outcomes to make an estimate. The simplest form of ensemble learning is to train the base-level algorithms on random subsets of the data and then either let them vote for the most popular classification or average their predictions. In this study, an ensemble learning method is proposed for improving multi-label classification evaluation criteria. We compared our method with well-known base-level algorithms on several data sets. Experimental results show that the proposed approach outperforms the well-known base classifiers on the multi-label classification problem.
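The simplest ensemble scheme described above (random subsets plus voting) can be sketched as follows. This is only an illustration, not the method proposed in the abstract: the base learner (a 1-nearest-neighbour memoriser), the function names, the subset fraction, and the per-label vote threshold of 0.5 are all assumptions made for the example.

```python
import random

def train_1nn(samples):
    """Hypothetical base learner: memorises its subset and answers by 1-nearest neighbour."""
    def predict(x):
        nearest = min(samples, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
        return nearest[1]  # label set of the closest memorised point
    return predict

def train_ensemble(data, n_models=15, subset_frac=0.7, seed=0):
    """Train each base learner on its own random subset of (features, label_set) pairs."""
    rng = random.Random(seed)
    size = max(1, int(subset_frac * len(data)))
    return [train_1nn(rng.sample(data, size)) for _ in range(n_models)]

def predict_label_set(models, x, threshold=0.5):
    """Per-label voting: keep a label if more than `threshold` of the models predict it."""
    votes = {}
    for model in models:
        for label in model(x):
            votes[label] = votes.get(label, 0) + 1
    return {label for label, count in votes.items() if count / len(models) > threshold}

# Toy data (made up for illustration): two well-separated clusters.
data = [
    ((0.0, 0.0), {"a", "b"}), ((0.1, 0.0), {"a", "b"}), ((0.0, 0.1), {"a", "b"}),
    ((5.0, 5.0), {"c"}), ((5.1, 5.0), {"c"}), ((5.0, 5.1), {"c"}),
]
models = train_ensemble(data)
print(predict_label_set(models, (0.05, 0.05)))
```

Voting per label rather than per label set is one possible design choice; it lets the ensemble output a label combination that no single base learner predicted.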
Ramírez-Corona, Mallinali (Instituto Nacional de Astrofísica Óptica y Electrónica) | Sucar, L. Enrique (Instituto Nacional de Astrofísica Óptica y Electrónica) | Morales, Eduardo F. (Instituto Nacional de Astrofísica Óptica y Electrónica)
In this paper we propose a novel hierarchical multi-label classification approach for tree and directed acyclic graph (DAG) hierarchies. The method predicts a single path (from the root to a leaf node) for tree hierarchies, and multiple paths for DAG hierarchies, by combining the predictions of every node in each possible path. In contrast with previous approaches, we evaluate all the paths, training local classifiers for each non-leaf node. The approach incorporates two contributions: (i) a cost is assigned to each node depending on the level it has in the hierarchy, giving more weight to correct predictions at the top levels; (ii) the relations between the nodes in the hierarchy are considered, by incorporating the parent label as in chained classifiers. The proposed approach was experimentally evaluated with 10 tree and 8 DAG hierarchical datasets in the domain of protein function prediction. It was contrasted with various state-of-the-art hierarchical classifiers using four common evaluation measures. The results show that our method is superior in almost all measures, and this difference is more significant in the case of DAG structures.
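The idea of scoring every root-to-leaf path with level-dependent weights can be sketched for the tree case as below. The abstract does not give the cost formula, so the weight 1/level, the weighted-average combination, and all names and probabilities here are assumptions chosen only to illustrate the mechanism; the parent-label chaining of contribution (ii) is not modelled.

```python
def root_to_leaf_paths(children, node):
    """Enumerate every path from `node` down to a leaf in a tree given as {parent: [children]}."""
    kids = children.get(node, [])
    if not kids:
        return [[node]]
    return [[node] + rest for kid in kids for rest in root_to_leaf_paths(children, kid)]

def path_score(path, prob, weight):
    """Weighted average of the local node probabilities along a path (root excluded)."""
    nodes = path[1:]
    if not nodes:
        return 0.0
    levels = range(1, len(nodes) + 1)
    total_weight = sum(weight(level) for level in levels)
    weighted = sum(weight(level) * prob[node] for level, node in zip(levels, nodes))
    return weighted / total_weight

def best_path(children, root, prob, weight=lambda level: 1.0 / level):
    """Evaluate all paths and return the highest-scoring one."""
    paths = root_to_leaf_paths(children, root)
    return max(paths, key=lambda p: path_score(p, prob, weight))

# Hypothetical hierarchy and local classifier outputs.
children = {"root": ["A", "B"], "A": ["A1", "A2"], "B": ["B1"]}
prob = {"A": 0.9, "A1": 0.2, "A2": 0.4, "B": 0.5, "B1": 0.95}

print(best_path(children, "root", prob))                            # top levels weighted more
print(best_path(children, "root", prob, weight=lambda level: 1.0))  # all levels weighted equally
```

With these made-up numbers the two weightings select different paths, which shows how emphasising the top levels can change the final prediction.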
Multi-label classification arises in many real-world applications. This paper empirically studies the performance of a variety of multi-label classification algorithms. Some of them are based on problem transformation; others are based on algorithm adaptation. Our experimental results show that the adaptive Multi-Label K-Nearest Neighbor performs the best, followed by Random k-Label Set, then Classifier Chain and Binary Relevance. Adaboost.MH performs the worst, followed by Pruned Problem Transformation. Our experimental results also give us confidence about the correlations among labels. These insights shed light on future research directions in multi-label classification.
Smirnov, Evgueni (DKE, Maastricht University) | Zhang, Hua (DKE, Maastricht University) | Peeters, Ralf (DKE, Maastricht University) | Nikolaev, Nikolay (London University) | Imkamp, Maike (Maastricht University)
This paper introduces a multi-label classification problem to the field of human computation. The problem involves training data such that each instance belongs to a set of classes. The true class sets of all the instances are provided together with their estimations presented by m human experts. Given the training data and the class-set estimates of the m experts for a new instance, the multi-label classification problem is to estimate the true class set of that instance. To solve the problem we propose an ensemble approach. Experiments show that the approach can outperform the best expert and the majority vote of the experts.
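The majority-vote baseline mentioned above can be sketched as follows. How the vote is taken is not specified in the abstract, so the per-class strict majority used here, along with the function name and example classes, is an assumption for illustration only.

```python
from collections import Counter

def majority_vote(expert_sets):
    """Keep a class if more than half of the m experts included it in their estimate."""
    m = len(expert_sets)
    counts = Counter(label for estimate in expert_sets for label in set(estimate))
    return {label for label, count in counts.items() if count > m / 2}

# Three hypothetical experts estimating the class set of one instance.
estimates = [{"x", "y"}, {"x"}, {"y", "z"}]
print(majority_vote(estimates))
```

A strict majority means that ties are resolved by leaving the class out when m is even.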