AITopics | rare label

Causal Invariance and Counterfactual Learning Driven Cooperative Game for Multi-Label Classification

Fan, Yijia, Zhang, Jusheng, Cai, Kaitong, Yang, Jing, Wang, Keze

arXiv.org Artificial Intelligence, Dec 2, 2025

Multi-label classification (MLC) remains vulnerable to label imbalance, spurious correlations, and distribution shifts, challenges that are particularly detrimental to rare label prediction. To address these limitations, we introduce the Causal Cooperative Game (CCG) framework, which conceptualizes MLC as a cooperative multi-player interaction. CCG unifies explicit causal discovery via Neural Structural Equation Models with a counterfactual curiosity reward to drive robust feature learning. Furthermore, it incorporates a causal invariance loss to ensure generalization across diverse environments, complemented by a specialized enhancement strategy for rare labels. Extensive benchmarking demonstrates that CCG substantially outperforms strong baselines in both rare label prediction and overall robustness. Through rigorous ablation studies and qualitative analysis, we validate the efficacy and interpretability of our components, underscoring the potential of synergizing causal inference with cooperative game theory for advancing multi-label learning.

artificial intelligence, machine learning, rare label, (18 more...)

arXiv:2512.00812

Country: Asia (0.28)

Genre:

Overview (1.00)
Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces

Rawat, Ankit Singh, Menon, Aditya Krishna, Jitkrittum, Wittawat, Jayasumana, Sadeep, Yu, Felix X., Reddi, Sashank, Kumar, Sanjiv

arXiv.org Machine Learning, May 12, 2021

Classification problems with a large number of labels arise in language modelling [Mikolov et al., 2013, Levy and Goldberg, 2014], recommender systems [Covington et al., 2016, Xu et al., 2016], and information retrieval [Agrawal et al., 2013, Prabhu and Varma, 2014]. Such large-output problems pose a core challenge: losses such as the softmax cross-entropy can be prohibitive to optimise, as they depend on the entire set of labels. Several works have thus devised negative sampling schemes for efficiently and effectively approximating such losses [Bengio and Senecal, 2008, Blanc and Rendle, 2018, Ruiz et al., 2018, Bamler and Mandt, 2020]. Broadly, negative sampling techniques sample a subset of "negative" labels, which are used to contrast against the observed "positive" labels. One further applies a suitable weighting on these "negatives", which ostensibly corrects the sampling bias introduced by the dependence on a random subset of labels. Intuitively, such bias assesses how closely a scheme approximates the unsampled loss on the full label set. This bias is well understood for sampled softmax schemes (see, e.g., Bengio and Senecal [2008]); surprisingly, however, far less is understood about other popular schemes, e.g., within-batch and uniform sampling (cf.

balanced error, softmax cross-entropy, weighting, (17 more...)
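
The negative-sampling idea in the abstract above can be illustrated with a minimal sketch. It assumes an importance-weighted sampled softmax, in which each sampled negative is reweighted by 1 / (m * q_j), where m is the number of samples and q_j the proposal probability. The function names and the uniform proposal are illustrative assumptions, not the authors' code.

```python
import math
import random


def full_softmax_loss(logits, y):
    """Softmax cross-entropy over the entire label set (cost grows with the number of labels)."""
    log_z = math.log(sum(math.exp(s) for s in logits))
    return log_z - logits[y]


def sampled_softmax_loss(logits, y, negatives, q):
    """Sampled approximation of the loss above.

    negatives: sampled label indices, all different from y (hypothetical input).
    q: mapping from label index to its proposal probability.
    Each negative is weighted by 1 / (m * q[j]) so that the estimate of the
    partition function is unbiased under the proposal distribution.
    """
    m = len(negatives)
    z_hat = math.exp(logits[y])
    z_hat += sum(math.exp(logits[j]) / (m * q[j]) for j in negatives)
    return math.log(z_hat) - logits[y]


def sample_uniform_negatives(num_labels, y, m, rng):
    """Draw m negatives uniformly (with replacement) from the labels other than y."""
    candidates = [j for j in range(num_labels) if j != y]
    q = {j: 1.0 / len(candidates) for j in candidates}
    return rng.choices(candidates, k=m), q


logits = [2.0, 0.5, -1.0, 0.0]
negatives, q = sample_uniform_negatives(len(logits), y=0, m=2, rng=random.Random(0))
approx = sampled_softmax_loss(logits, 0, negatives, q)
exact = full_softmax_loss(logits, 0)
```

When every negative is drawn exactly once under a uniform proposal, the weights reduce to 1 and the sampled loss coincides with the full loss; with fewer samples it is only an approximation.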

arXiv:2105.05736

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.05)
Europe > Sweden > Stockholm > Stockholm (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.48)

Accelerating Extreme Classification via Adaptive Feature Agglomeration

Jalan, Ankit, Kar, Purushottam

arXiv.org Artificial Intelligence, May 28, 2019

Extreme classification seeks to assign each data point the most relevant labels from a universe of a million or more labels. This task faces the dual challenge of high precision and scalability, with millisecond-level prediction times being a benchmark. We propose DEFRAG, an adaptive feature agglomeration technique to accelerate extreme classification algorithms. Despite past works on feature clustering and selection, DEFRAG distinguishes itself in being able to scale to millions of features, and is especially beneficial when feature sets are sparse, which is typical of recommendation and multi-label datasets. The method comes with provable performance guarantees and performs efficient task-driven agglomeration to reduce feature dimensionalities by an order of magnitude or more. Experiments show that DEFRAG can not only reduce training and prediction times of several leading extreme classification algorithms by as much as 40%, but also be used for feature reconstruction to address the problem of missing features, as well as offer superior coverage on rare labels.
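
As a rough illustration of what feature agglomeration does to a sparse feature vector, the sketch below sums the values of features that fall in the same cluster. It assumes the cluster assignment is already given; the `feature_to_cluster` mapping is hypothetical, and the way DEFRAG actually chooses its clusters is not shown here.

```python
from collections import defaultdict


def agglomerate(sparse_vec, feature_to_cluster):
    """Map a sparse feature vector to a lower-dimensional one.

    sparse_vec: dict {feature_index: value} holding only non-zero features.
    feature_to_cluster: dict {feature_index: cluster_index} (assumed given).
    Features in the same cluster are merged by summing their values.
    """
    reduced = defaultdict(float)
    for feature, value in sparse_vec.items():
        reduced[feature_to_cluster[feature]] += value
    return dict(reduced)


# Six original features grouped into three clusters.
clusters = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 1}
reduced = agglomerate({0: 1.0, 3: 2.0, 5: 0.5}, clusters)
```

Because only non-zero entries are visited, the cost of the mapping depends on the sparsity of the input rather than on the original feature dimensionality.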

data mining, machine learning, natural language, (19 more...)

arXiv:1905.11769

Country:

North America > United States (0.04)
Asia > India > Uttar Pradesh > Kanpur (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Thresholding Classifiers to Maximize F1 Score

Lipton, Zachary Chase, Elkan, Charles, Narayanaswamy, Balakrishnan

arXiv.org Machine Learning, May 13, 2014

This paper provides new insight into maximizing F1 scores in the context of binary classification and also in the context of multilabel classification. The harmonic mean of precision and recall, the F1 score is widely used to measure the success of a binary classifier when one class is rare. Micro-averaged, macro-averaged, and per-instance-averaged F1 scores are used in multilabel classification. For any classifier that produces a real-valued output, we derive the relationship between the best achievable F1 score and the decision-making threshold that achieves this optimum. As a special case, if the classifier outputs are well-calibrated conditional probabilities, then the optimal threshold is half the optimal F1 score. As another special case, if the classifier is completely uninformative, then the optimal behavior is to classify all examples as positive. Since the actual prevalence of positive examples typically is low, this behavior can be considered undesirable. As a case study, we discuss the results, which can be surprising, of applying this procedure when predicting 26,853 labels for Medline documents.
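
The two special cases in the abstract above can be checked numerically with a small sketch. It assumes the scores are calibrated probabilities and uses a plug-in approximation of F1 built from expected counts (expected true positives, predicted positives, and expected actual positives). The function names are illustrative and not taken from the paper.

```python
def expected_f1(probs, threshold):
    """Plug-in F1 when every item with p >= threshold is predicted positive.

    Assumes each p is a calibrated probability that the item is truly positive,
    so sum(p) over a set of items is the expected number of positives in it.
    """
    predicted = [p for p in probs if p >= threshold]
    expected_tp = sum(predicted)      # expected true positives
    expected_pos = sum(probs)         # expected actual positives
    denom = len(predicted) + expected_pos
    return 2.0 * expected_tp / denom if denom > 0 else 0.0


def best_threshold(probs):
    """Sweep the distinct scores and return the (threshold, F1) pair with the highest F1."""
    candidates = sorted(set(probs))
    return max(((t, expected_f1(probs, t)) for t in candidates),
               key=lambda pair: pair[1])


# Informative, calibrated scores.
probs = [0.9, 0.6, 0.3, 0.2, 0.1, 0.05]
t_star, f_star = best_threshold(probs)
# Items at or above t_star have p >= f_star / 2; items below have p <= f_star / 2.

# Uninformative classifier: every item gets the base rate as its score.
flat = [0.1] * 20
t_flat, f_flat = best_threshold(flat)
# The only candidate threshold labels every item positive.
```

In the flat case the sweep has a single candidate threshold, so every item is labelled positive, which matches the behavior the abstract describes as optimal for F1 yet undesirable when positives are rare.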

information retrieval, machine learning, natural language, (18 more...)

arXiv:1402.1892

Country: North America > United States > California (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
