AITopics | aggressive sampling

Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification

Neural Information Processing Systems, Nov-21-2025, 14:23:37 GMT

We address the problem of multi-class classification in the case where the number of classes is very large. We propose a double sampling strategy on top of a multi-class to binary reduction strategy, which transforms the original multi-class problem into a binary classification problem over pairs of examples. The aim of the sampling strategy is to overcome the curse of long-tailed class distributions exhibited in the majority of large-scale multi-class classification problems and to reduce the number of pairs of examples in the expanded data. We show that this strategy does not alter the consistency of the empirical risk minimization principle defined over the double sample reduction. Experiments are carried out on DMOZ and Wikipedia collections with 10,000 to 100,000 classes, where we show the efficiency of the proposed approach in terms of training and prediction time, memory consumption, and predictive performance relative to state-of-the-art approaches.
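The abstract does not spell out the two sampling stages, so the following is only a rough sketch under stated assumptions: stage one keeps each example with a class-dependent probability (to damp frequent classes in a long-tailed distribution), and stage two draws a small number of incorrect classes per kept example (to shrink the expanded pair data). The names `double_sample`, `keep_prob`, and `kappa` are hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]          # (input x, true class y)
Triple = Tuple[Sequence[float], int, int]      # (x, true class y, sampled wrong class y')


def double_sample(
    examples: List[Example],
    num_classes: int,
    keep_prob: Dict[int, float],
    kappa: int,
    rng: random.Random,
) -> List[Triple]:
    """Illustrative two-stage sampling; both stages are assumptions.

    Stage 1: keep example (x, y) with probability keep_prob[y].
    Stage 2: for each kept example, draw up to `kappa` distinct wrong
             classes uniformly at random.
    """
    triples: List[Triple] = []
    for x, y in examples:
        # Stage 1: subsample examples, class by class.
        if rng.random() >= keep_prob[y]:
            continue
        # Stage 2: subsample the wrong classes compared against y.
        wrong_classes = [k for k in range(num_classes) if k != y]
        n_draws = min(kappa, len(wrong_classes))
        for y_prime in rng.sample(wrong_classes, n_draws):
            triples.append((x, y, y_prime))
    return triples


# Hypothetical usage: class 0 is frequent and is kept half the time.
data = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 3)]
probs = {0: 0.5, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}
sampled = double_sample(data, num_classes=5, keep_prob=probs, kappa=2,
                        rng=random.Random(0))
```

With `kappa` much smaller than the number of classes, each kept example contributes `kappa` pairs instead of one pair per incorrect class, which is where the reduction in expanded data would come from in this sketch.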

aggressive sampling, binary reduction, multi-class, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Reviews: Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification

Neural Information Processing Systems, Oct-7-2024, 14:11:28 GMT

Summary: This paper proposes a new reduction from multi-class classification to binary classification that is especially suitable when the number of classes is very large. The authors consider hypotheses that map (input, class) pairs to scores, and the underlying loss function counts the fraction of wrong classes that are scored higher than the true class. More specifically, they assume a feature transformation phi that maps (input, class) pairs to a p-dimensional feature space, and they learn a mapping from R^p to scores. Their reduction extends the work of Joshi et al. (2015), which, for each data point (x, y), creates K-1 transformed points, each corresponding intuitively to the comparison of the label y with some incorrect label y'. Because the transformed dataset contains correlated training examples, many standard generalization bounds cannot be applied.
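The loss and the pairwise expansion described in the summary can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_phi` is a made-up feature map, ties are assumed not to count as errors, and the layout of each transformed point (a pair of feature vectors plus the compared label) is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
FeatureMap = Callable[[Vector, int], Vector]   # phi: (input, class) -> R^p


def ranking_loss(scores: Sequence[float], true_class: int) -> float:
    """Fraction of wrong classes scored strictly higher than the true class.

    Assumption: a tie with the true class is not counted as an error.
    """
    true_score = scores[true_class]
    wrong_above = sum(
        1 for k, s in enumerate(scores) if k != true_class and s > true_score
    )
    return wrong_above / (len(scores) - 1)


def expand_to_pairs(
    x: Vector, y: int, num_classes: int, phi: FeatureMap
) -> List[Tuple[Vector, Vector, int]]:
    """Create the K-1 transformed points for one example (x, y).

    Each point pairs phi(x, y) with phi(x, y') for one incorrect label y'.
    """
    true_features = phi(x, y)
    return [
        (true_features, phi(x, y_prime), y_prime)
        for y_prime in range(num_classes)
        if y_prime != y
    ]


def toy_phi(x: Vector, k: int) -> Vector:
    """Hypothetical feature map into R^2, for illustration only."""
    return [x[k], sum(x)]


# One example with K = 3 classes yields K - 1 = 2 transformed points.
pairs = expand_to_pairs([1.0, 2.0, 3.0], y=0, num_classes=3, phi=toy_phi)

# Scores for K = 4 classes; the true class is 2.
loss = ranking_loss([0.2, 0.9, 0.5, 0.1], true_class=2)
```

All K-1 points built from the same (x, y) share `phi(x, y)`, which makes the correlation between transformed examples mentioned in the review concrete.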

generalization, reduction, text classification, (9 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.58)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.40)

Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification

Joshi, Bikash, Amini, Massih R., Partalas, Ioannis, Iutzeler, Franck, Maximov, Yury

Neural Information Processing Systems, Feb-14-2020, 14:44:20 GMT

We address the problem of multi-class classification in the case where the number of classes is very large. We propose a double sampling strategy on top of a multi-class to binary reduction strategy, which transforms the original multi-class problem into a binary classification problem over pairs of examples. The aim of the sampling strategy is to overcome the curse of long-tailed class distributions exhibited in the majority of large-scale multi-class classification problems and to reduce the number of pairs of examples in the expanded data. We show that this strategy does not alter the consistency of the empirical risk minimization principle defined over the double sample reduction. Experiments are carried out on DMOZ and Wikipedia collections with 10,000 to 100,000 classes, where we show the efficiency of the proposed approach in terms of training and prediction time, memory consumption, and predictive performance relative to state-of-the-art approaches.

aggressive sampling, binary reduction, text classification, (3 more...)

Genre: Research Report (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.40)
