Understanding and predicting latent emotions of users toward online contents, known as social emotion mining, has become increasingly important to both social platforms and businesses alike. Despite recent developments, however, very little attention has been made to the issues of nuance, subjectivity, and bias of social emotions. In this paper, we fill this gap by formulating social emotion mining as a robust label ranking problem, and propose: (1) a robust measure, named as G-mean-rank (GMR), which sets a formal criterion consistent with practical intuition; and (2) a simple yet effective label ranking model, named as ROAR, that is more robust toward unbalanced datasets (which are common). Through comprehensive empirical validation using 4 real datasets and 16 benchmark semi-synthetic label ranking datasets, and a case study, we demonstrate the superiorities of our proposals over 2 popular label ranking measures and 6 competing label ranking algorithms. The datasets and implementations used in the empirical validation are available for access.
We propose to solve a label ranking problem as a structured output regression task. We adopt a least square surrogate loss approach that solves a supervised learning problem in two steps: the regression step in a well-chosen feature space and the pre-image step. We use specific feature maps/embeddings for ranking data, which convert any ranking/permutation into a vector representation. These embeddings are all well-tailored for our approach, either by resulting in consistent estimators, or by solving trivially the pre-image problem which is often the bottleneck in structured prediction. We also propose their natural extension to the case of partial rankings and prove their efficiency on real-world datasets.
Multi-label learning methods assign multiple labels to one object. In practice, in addition to differentiating relevant labels from irrelevant ones, it is often desired to rank the relevant labels for an object, whereas the rankings of irrelevant labels are not important. Such a requirement, however, cannot be met because most existing methods were designed to optimize existing criteria, yet there is no criterion which encodes the aforementioned requirement. In this paper, we present a new criterion, Pro Loss, concerning the prediction on all labels as well as the rankings of only relevant labels. We then propose ProSVM which optimizes Pro Lossefficiently using alternating direction method of multipliers. We further improve its efficiency with an upper approximation that reduces the number of constraints from O ( T , 2 ) to O ( T ), where T is the number of labels. Experiments show that our proposals are not only superior on Pro Loss, but also highly competitive on existing evaluation criteria.
Label ranking is the task of inferring a total order over a predefined set of labels for each given instance. We present a general framework for batch learning of label ranking functions from supervised data. We assume that each instance in the training data is associated with a list of preferences over the label-set, however we do not assume that this list is either complete orconsistent. This enables us to accommodate a variety of ranking problems. In contrast to the general form of the supervision, our goal is to learn a ranking function that induces a total order over the entire set of labels. Special cases of our setting are multilabel categorization and hierarchical classification. We present a general boosting-based learning algorithm for the label ranking problem and prove a lower bound on the progress of each boosting iteration. The applicability of our approach is demonstrated with a set of experiments on a large-scale text corpus.
Label ranking aims to map instances to an order over a predefined set of labels. It is ideal that the label ranking model is trained by directly maximizing performance measures on training data. However, existing studies on label ranking models mainly based on the minimization of classification errors or rank losses. To fill in this gap in label ranking, in this paper a novel label ranking model is learned by minimizing a loss function directly defined on the performance measures. The proposed algorithm, referred to as BoostLR, employs a boosting framework and utilizes the rank aggregation technique to construct weak label rankers. Experimental results reveal the initial success of BoostLR.