AITopics | labelset

Collaborating Authors

labelset

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Addressing Multilabel Imbalance with an Efficiency-Focused Approach Using Diffusion Model-Generated Synthetic Samples

Charte, Francisco, Dávila, Miguel Ángel, Pérez-Godoy, María Dolores, del Jesus, María José

arXiv.org Artificial IntelligenceJan-18-2025

Predictive models trained on imbalanced data tend to produce biased results. This problem is exacerbated when there is not just one output label, but a set of them. This is the case for multilabel learning (MLL) algorithms used to classify patterns, rank labels, or learn the distribution of outputs. Many solutions have been proposed in the literature. The one that can be applied universally, independent of the algorithm used to build the model, is data resampling. The generation of new instances associated with minority labels, so that empty areas of the feature space are filled, helps to improve the obtained models. The quality of these new instances depends on the algorithm used to generate them. In this paper, a diffusion model tailored to produce new instances for MLL data, called MLDM (\textit{MultiLabel Diffusion Model}), is proposed. Diffusion models have been mainly used to generate artificial images and videos. Our proposed MLDM is based on this type of models. The experiments conducted compare MLDM with several other MLL resampling algorithms. The results show that MLDM is competitive while it improves efficiency.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2501.10822

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > Spain > Andalusia > Jaén Province > Jaén (0.04)
(2 more...)

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

A Cross-Conformal Predictor for Multi-label Classification

Papadopoulos, Harris

arXiv.org Artificial IntelligenceNov-29-2022

Unlike the typical classification setting where each instance is associated with a single class, in multi-label learning each instance is associated with multiple classes simultaneously. Therefore the learning task in this setting is to predict the subset of classes to which each instance belongs. This work examines the application of a recently developed framework called Conformal Prediction (CP) to the multi-label learning setting. CP complements the predictions of machine learning algorithms with reliable measures of confidence. As a result the proposed approach instead of just predicting the most likely subset of classes for a new unseen instance, also indicates the likelihood of each predicted subset being correct. This additional information is especially valuable in the multi-label setting where the overall uncertainty is extremely high.

artificial intelligence, labelset, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-662-44722-2_26

2211.16238

Country:

North America > United States > New York (0.04)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)

Genre: Research Report (0.67)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Add feedback

Estimating Multi-label Accuracy using Labelset Distributions

Park, Laurence A. F., Read, Jesse

arXiv.org Artificial IntelligenceSep-9-2022

A multi-label classifier estimates the binary label state (relevant vs irrelevant) for each of a set of concept labels, for any given instance. Probabilistic multi-label classifiers provide a predictive posterior distribution over all possible labelset combinations of such label states (the powerset of labels) from which we can provide the best estimate, simply by selecting the labelset corresponding to the largest expected accuracy, over that distribution. For example, in maximizing exact match accuracy, we provide the mode of the distribution. But how does this relate to the confidence we may have in such an estimate? Confidence is an important element of real-world applications of multi-label classifiers (as in machine learning in general) and is an important ingredient in explainability and interpretability. However, it is not obvious how to provide confidence in the multi-label context and relating to a particular accuracy metric, and nor is it clear how to provide a confidence which correlates well with the expected accuracy, which would be most valuable in real-world decision making. In this article we estimate the expected accuracy as a surrogate for confidence, for a given accuracy metric. We hypothesise that the expected accuracy can be estimated from the multi-label predictive distribution. We examine seven candidate functions for their ability to estimate expected accuracy from the predictive distribution. We found three of these to correlate to expected accuracy and are robust. Further, we determined that each candidate function can be used separately to estimate Hamming similarity, but a combination of the candidates was best for expected Jaccard index and exact match.

accuracy, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2209.04163

Country:

Oceania > Australia (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Add feedback

Multilabel Prototype Generation for Data Reduction in k-Nearest Neighbour classification

Valero-Mas, Jose J., Gallego, Antonio Javier, Alonso-Jiménez, Pablo, Serra, Xavier

arXiv.org Artificial IntelligenceJul-22-2022

Prototype Generation (PG) methods are typically considered for improving the efficiency of the $k$-Nearest Neighbour ($k$NN) classifier when tackling high-size corpora. Such approaches aim at generating a reduced version of the corpus without decreasing the classification performance when compared to the initial set. Despite their large application in multiclass scenarios, very few works have addressed the proposal of PG methods for the multilabel space. In this regard, this work presents the novel adaptation of four multiclass PG strategies to the multilabel case. These proposals are evaluated with three multilabel $k$NN-based classifiers, 12 corpora comprising a varied range of domains and corpus sizes, and different noise scenarios artificially induced in the data. The results obtained show that the proposed adaptations are capable of significantly improving -- both in terms of efficiency and classification performance -- the only reference multilabel PG work in the literature as well as the case in which no PG method is applied, also presenting a statistically superior robustness in noisy scenarios. Moreover, these novel PG strategies allow prioritising either the efficiency or efficacy criteria through its configuration depending on the target scenario, hence covering a wide area in the solution space not previously filled by other works.

artificial intelligence, machine learning, scenario, (18 more...)

arXiv.org Artificial Intelligence

2207.10947

Country:

Europe > Spain > Valencian Community > Alicante Province > Alicante (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.68)

Industry:

Media > Music (0.68)
Leisure & Entertainment (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.51)

Add feedback

Local Multi-Label Explanations for Random Forest

Mylonas, Nikolaos, Mollas, Ioannis, Bassiliades, Nick, Tsoumakas, Grigorios

arXiv.org Artificial IntelligenceJul-5-2022

Multi-label classification is a challenging task, particularly in domains where the number of labels to be predicted is large. Deep neural networks are often effective at multi-label classification of images and textual data. When dealing with tabular data, however, conventional machine learning algorithms, such as tree ensembles, appear to outperform competition. Random forest, being a popular ensemble algorithm, has found use in a wide range of real-world problems. Such problems include fraud detection in the financial domain, crime hotspot detection in the legal sector, and in the biomedical field, disease probability prediction when patient records are accessible. Since they have an impact on people's lives, these domains usually require decision-making systems to be explainable. Random Forest falls short on this property, especially when a large number of tree predictors are used. This issue was addressed in a recent research named LionForests, regarding single label classification and regression. In this work, we adapt this technique to multi-label classification problems, by employing three different strategies regarding the labels that the explanation covers. Finally, we provide a set of qualitative and quantitative experiments to assess the efficacy of this approach.

artificial intelligence, explanation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-23618-1_25

2207.01994

Country:

Europe > Greece > Central Macedonia > Thessaloniki (0.05)
Asia > Japan (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Basic and Depression Specific Emotion Identification in Tweets: Multi-label Classification Experiments

Farruque, Nawshad, Huang, Chenyang, Zaiane, Osmar, Goebel, Randy

arXiv.org Artificial IntelligenceMay-26-2021

We choose our basic emotions from a hybrid emotion model consisting of the common emotions from four highly regarded psychological models of emotions. Moreover, we augment that emotion model with new emotion categories because of their importance in the analysis of depression. Most of those additional emotions have not been used in previous emotion mining research. Our experimental analyses show that a cost sensitive RankSVM algorithm and a Deep Learning model are both robust, measured by both Macro F-measures and Micro F-measures. This suggests that these algorithms are superior in addressing the widely known data imbalance problem in multi-label learning. Moreover, our application of Deep Learning performs the best, giving it an edge in modeling deep semantic features of our extended emotional categories.

classification, emotion, tweet, (15 more...)

arXiv.org Artificial Intelligence

2105.12364

Country:

North America > Canada > Alberta (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Evolving Multi-label Classification Rules by Exploiting High-order Label Correlation

Nazmi, Shabnam, Yan, Xuyang, Homaifar, Abdollah, Doucette, Emily

arXiv.org Machine LearningJul-22-2020

In multi-label classification tasks, each problem instance is associated with multiple classes simultaneously. In such settings, the correlation between labels contains valuable information that can be used to obtain more accurate classification models. The correlation between labels can be exploited at different levels such as capturing the pair-wise correlation or exploiting the higher-order correlations. Even though the high-order approach is more capable of modeling the correlation, it is computationally more demanding and has scalability issues. This paper aims at exploiting the high-order label correlation within subsets of labels using a supervised learning classifier system (UCS). For this purpose, the label powerset (LP) strategy is employed and a prediction aggregation within the set of the relevant labels to an unseen instance is utilized to increase the prediction capability of the LP method in the presence of unseen labelsets. Exact match ratio and Hamming loss measures are considered to evaluate the rule performance and the expected fitness value of a classifier is investigated for both metrics. Also, a computational complexity analysis is provided for the proposed algorithm. The experimental results of the proposed method are compared with other well-known LP-based methods on multiple benchmark datasets and confirm the competitive performance of this method.

artificial intelligence, inductive learning, machine learning, (15 more...)

arXiv.org Machine Learning

2007.11609

Country:

Oceania > New Zealand (0.04)
North America > United States > North Carolina > Guilford County > Greensboro (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > Alabama (0.04)

Genre: Research Report (0.64)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Using Correlation for Labelset Selection in Multi-Label Classification of Users Reactions

Curi, Zacarias (Pontifícia Universidade Católica do Paraná) | Jr, Alceu de Souza Britto (Pontifícia Universidade Católica do Paraná) | Paraiso, Emerson Cabrera

AAAI ConferencesMay-15-2019

The increasing use of social networks has made opinion mining an important field in the area of Natural Language Processing. The analysis of texts from the reader perspective tends to generate multi-label data since one can interpret the text using different contexts. In this paper, a new method for multi-label classification is proposed to identify reactions or emotions in texts. The new method uses data correlation to improve the class ensemble process used to create the classifiers. In addition to the new method, a new corpus of news written in Brazilian Portuguese labeled with user reactions is presented. Experiments performed with the new corpus and with two existing corpora have demonstrated that the proposed method generates statistically superior or equivalent results, requiring fewer classifiers or classes than traditional problem transformation methods.

classification, correlation, reaction, (15 more...)

AAAI Conferences

The Thirty-Second International Flairs Conference

Country:

South America > Brazil > Paraná > Curitiba (0.04)
Oceania > New Zealand > North Island > Waikato (0.04)
North America > United States > Oregon (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Transportation > Air (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.86)

Add feedback

Nearest Labelset Using Double Distances for Multi-label Classification

Gweon, Hyukjun, Schonlau, Matthias, Steiner, Stefan

arXiv.org Machine LearningFeb-15-2017

Noname manuscript No. (will be inserted by the editor) Abstract Multi-label classification is a type of supervised learning where an instance may belong to multiple labels simultaneously. Predicting each label independently has been criticized for not exploiting any correlation between labels. In this paper we propose a novel approach, Nearest Labelset using Double Distances (NLDD), that predicts the labelset observed in the training data that minimizes a weighted sum of the distances in both the feature space and the label space to the new instance. The weights specify the relative tradeoff between the two distances. The weights are estimated from a binomial regression of the number of misclassified labels as a function of the two distances. Model parameters are estimated by maximum likelihood. NLDD only considers labelsets observed in the training data, thus implicitly taking into account label dependencies. Keywords Multi-label classification, Machine learning, Label correlations 1 Introduction In multi-label classification, an instance can belong to multiple labels at the same time. This is different from multi-class or binary classification, where an instance can only be associated with a single label.

artificial intelligence, labelset, machine learning, (19 more...)

arXiv.org Machine Learning

1702.04684

Country:

Europe > Austria (0.28)
North America > United States > New York (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Ensemble Methods for Multi-label Classification

Rokach, Lior, Schclar, Alon, Itach, Ehud

arXiv.org Machine LearningJul-6-2013

Ensemble methods have been shown to be an effective tool for solving multi-label classification tasks. In the RAndom k-labELsets (RAKEL) algorithm, each member of the ensemble is associated with a small randomly-selected subset of k labels. Then, a single label classifier is trained according to each combination of elements in the subset. In this paper we adopt a similar approach, however, instead of randomly choosing subsets, we select the minimum required subsets of k labels that cover all labels and meet additional constraints such as coverage of inter-label correlations. Construction of the cover is achieved by formulating the subset selection as a minimum set covering problem (SCP) and solving it by using approximation algorithms. Every cover needs only to be prepared once by offline algorithms. Once prepared, a cover may be applied to the classification of any given multi-label dataset whose properties conform with those of the cover. The contribution of this paper is two-fold. First, we introduce SCP as a general framework for constructing label covers while allowing the user to incorporate cover construction constraints. We demonstrate the effectiveness of this framework by proposing two construction constraints whose enforcement produces covers that improve the prediction performance of random selection. Second, we provide theoretical bounds that quantify the probabilities of random selection to produce covers that meet the proposed construction criteria. The experimental results indicate that the proposed methods improve multi-label classification accuracy and stability compared with the RAKEL algorithm and to other state-of-the-art algorithms.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1307.1769

Country: Asia > Middle East > Israel (0.28)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

Add feedback