multi-class
Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification
We address the problem of multi-class classification in the case where the number of classes is very large. We propose a double sampling strategy on top of a multi-class to binary reduction strategy, which transforms the original multi-class problem into a binary classification problem over pairs of examples. The aim of the sampling strategy is to overcome the curse of long-tailed class distributions exhibited in majority of large-scale multi-class classification problems and to reduce the number of pairs of examples in the expanded data. We show that this strategy does not alter the consistency of the empirical risk minimization principle defined over the double sample reduction. Experiments are carried out on DMOZ and Wikipedia collections with 10,000 to 100,000 classes where we show the efficiency of the proposed approach in terms of training and prediction time, memory consumption, and predictive performance with respect to state-of-the-art approaches.
SPARSE Data, Rich Results: Few-Shot Semi-Supervised Learning via Class-Conditioned Image Translation
Manni, Guido, Lauretti, Clemente, Zollo, Loredana, Soda, Paolo
Deep learning has revolutionized medical imaging, but its effectiveness is severely limited by insufficient labeled training data. This paper introduces a novel GAN-based semi-supervised learning framework specifically designed for low labeled-data regimes, evaluated across settings with 5 to 50 labeled samples per class. Our approach integrates three specialized neural networks -- a generator for class-conditioned image translation, a discriminator for authenticity assessment and classification, and a dedicated classifier -- within a three-phase training framework. The method alternates between supervised training on limited labeled data and unsupervised learning that leverages abundant unlabeled images through image-to-image translation rather than generation from noise. We employ ensemble-based pseudo-labeling that combines confidence-weighted predictions from the discriminator and classifier with temporal consistency through exponential moving averaging, enabling reliable label estimation for unlabeled data. Comprehensive evaluation across eleven MedMNIST datasets demonstrates that our approach achieves statistically significant improvements over six state-of-the-art GAN-based semi-supervised methods, with particularly strong performance in the extreme 5-shot setting where the scarcity of labeled data is most challenging. The framework maintains its superiority across all evaluated settings (5, 10, 20, and 50 shots per class). Our approach offers a practical solution for medical imaging applications where annotation costs are prohibitive, enabling robust classification performance even with minimal labeled data. Code is available at https://github.com/GuidoManni/SPARSE.
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.68)
- Health & Medicine > Health Care Technology (0.69)
- Health & Medicine > Diagnostic Medicine > Imaging (0.56)
- Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
COOD: Combined out-of-distribution detection using multiple measures for anomaly & novel class detection in large-scale hierarchical classification
Hogeweg, L. E., Gangireddy, R., Brunink, D., Kalkman, V. J., Cornelissen, L., Kamminga, J. W.
High-performing out-of-distribution (OOD) detection, both anomaly and novel class, is an important prerequisite for the practical use of classification models. In this paper, we focus on the species recognition task in images concerned with large databases, a large number of fine-grained hierarchical classes, severe class imbalance, and varying image quality. We propose a framework for combining individual OOD measures into one combined OOD (COOD) measure using a supervised model. The individual measures are several existing state-of-the-art measures and several novel OOD measures developed with novel class detection and hierarchical class structure in mind. COOD was extensively evaluated on three large-scale (500k+ images) biodiversity datasets in the context of anomaly and novel class detection. We show that COOD outperforms individual, including state-of-the-art, OOD measures by a large margin in terms of TPR@1% FPR in the majority of experiments, e.g., improving detecting ImageNet images (OOD) from 54.3% to 85.4% for the iNaturalist 2018 dataset. SHAP (feature contribution) analysis shows that different individual OOD measures are essential for various tasks, indicating that multiple OOD measures and combinations are needed to generalize. Additionally, we show that explicitly considering ID images that are incorrectly classified for the original (species) recognition task is important for constructing high-performing OOD detection methods and for practical applicability. The framework can easily be extended or adapted to other tasks and media modalities.
- Europe > Norway (0.04)
- Europe > Sweden (0.04)
- Europe > Netherlands > South Holland > Leiden (0.04)
- (3 more...)
CrisisViT: A Robust Vision Transformer for Crisis Image Classification
Long, Zijun, McCreadie, Richard, Imran, Muhammad
In times of emergency, crisis response agencies need to quickly and accurately assess the situation on the ground in order to deploy relevant services and resources. However, authorities often have to make decisions based on limited information, as data on affected regions can be scarce until local response services can provide first-hand reports. Fortunately, the widespread availability of smartphones with high-quality cameras has made citizen journalism through social media a valuable source of information for crisis responders. However, analyzing the large volume of images posted by citizens requires more time and effort than is typically available. To address this issue, this paper proposes the use of state-of-the-art deep neural models for automatic image classification/tagging, specifically by adapting transformer-based architectures for crisis image classification (CrisisViT). We leverage the new Incidents1M crisis image dataset to develop a range of new transformer-based image classification models. Through experimentation over the standard Crisis image benchmark dataset, we demonstrate that the CrisisViT models significantly outperform previous approaches in emergency type, image relevance, humanitarian category, and damage severity classification. Additionally, we show that the new Incidents1M dataset can further augment the CrisisViT models resulting in an additional 1.25% absolute accuracy gain.
- North America > United States > California > Los Angeles County > Long Beach (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Nebraska > Douglas County > Omaha (0.05)
- (17 more...)
- Transportation > Air (0.46)
- Media > News (0.34)
Increase Information Transfer Rates in BCI by CSP Extension to Multi-class
Brain-Computer Interfaces (BCI) are an interesting emerging technology that is driven by the motivation to develop an effective communication in- terface translating human intentions into a control signal for devices like computers or neuroprostheses. If this can be done bypassing the usual hu- man output pathways like peripheral nerves and muscles it can ultimately become a valuable tool for paralyzed patients. Most activity in BCI re- search is devoted to finding suitable features and algorithms to increase information transfer rates (ITRs). The present paper studies the implica- tions of using more classes, e.g., left vs. right hand vs. foot, for operating a BCI. We contribute by (1) a theoretical study showing under some mild assumptions that it is practically not useful to employ more than three or four classes, (2) two extensions of the common spatial pattern (CSP) algorithm, one interestingly based on simultaneous diagonalization, and (3) controlled EEG experiments that underline our theoretical findings and show excellent improved ITRs.
MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis
Yang, Jiancheng, Shi, Rui, Ni, Bingbing
We present MedMNIST, a collection of 10 pre-processed medical open datasets. MedMNIST is standardized to perform classification tasks on lightweight 28x28 images, which requires no background knowledge. Covering the primary data modalities in medical image analysis, it is diverse on data scale (from 100 to 100,000) and tasks (binary/multi-class, ordinal regression and multi-label). MedMNIST could be used for educational purpose, rapid prototyping, multi-modal machine learning or AutoML in medical image analysis. Moreover, MedMNIST Classification Decathlon is designed to benchmark AutoML algorithms on all 10 datasets; We have compared several baseline methods, including open-source or commercial AutoML tools. The datasets, evaluation code and baseline methods for MedMNIST are publicly available at https://medmnist.github.io/.
Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification
Joshi, Bikash, Amini, Massih R., Partalas, Ioannis, Iutzeler, Franck, Maximov, Yury
We address the problem of multi-class classification in the case where the number of classes is very large. We propose a double sampling strategy on top of a multi-class to binary reduction strategy, which transforms the original multi-class problem into a binary classification problem over pairs of examples. The aim of the sampling strategy is to overcome the curse of long-tailed class distributions exhibited in majority of large-scale multi-class classification problems and to reduce the number of pairs of examples in the expanded data. We show that this strategy does not alter the consistency of the empirical risk minimization principle defined over the double sample reduction. Experiments are carried out on DMOZ and Wikipedia collections with 10,000 to 100,000 classes where we show the efficiency of the proposed approach in terms of training and prediction time, memory consumption, and predictive performance with respect to state-of-the-art approaches.