Class-Aware Contrastive Optimization for Imbalanced Text Classification
Grigorii Khvatskii, Nuno Moniz, Khoa Doan, Nitesh V. Chawla
arXiv.org Artificial Intelligence
The unique characteristics of text data make classification a complex problem. Advances in unsupervised and semi-supervised learning and in autoencoder architectures have addressed several of the resulting challenges. However, these methods still struggle with imbalanced text classification, a common scenario in real-world applications, and tend to produce embeddings with unfavorable properties such as class overlap. In this paper, we show that class-aware contrastive optimization combined with denoising autoencoders can successfully tackle imbalanced text classification tasks, achieving better performance than the current state of the art. Concretely, our proposal combines a reconstruction loss with contrastive class separation in the embedding space, allowing a better balance between the truthfulness of the generated embeddings and the model's ability to separate different classes. Compared with an extensive set of traditional and state-of-the-art competing methods, our proposal demonstrates a notable increase in performance across a wide variety of text datasets.
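The idea of combining a reconstruction loss with contrastive class separation can be illustrated with a small sketch. The abstract does not give the paper's actual objective, so everything below is an assumption: the contrastive term is a classic pairwise margin loss used as a stand-in, and the names `reconstruction_loss`, `contrastive_term`, `combined_loss`, `alpha`, and `margin` are hypothetical. The sketch also applies no class-frequency weighting, which an imbalance-aware method might well use.

```python
import math
from itertools import combinations


def reconstruction_loss(inputs, reconstructions):
    """Mean squared error between inputs and their reconstructions."""
    total, count = 0.0, 0
    for x, x_hat in zip(inputs, reconstructions):
        for a, b in zip(x, x_hat):
            total += (a - b) ** 2
            count += 1
    return total / count


def contrastive_term(embeddings, labels, margin=1.0):
    """Pairwise margin loss (a stand-in, not the paper's formulation).

    Same-class pairs are penalised by their squared distance;
    different-class pairs are penalised only when they are closer
    than `margin`.
    """
    losses = []
    for i, j in combinations(range(len(embeddings)), 2):
        d = math.dist(embeddings[i], embeddings[j])
        if labels[i] == labels[j]:
            losses.append(d ** 2)
        else:
            losses.append(max(0.0, margin - d) ** 2)
    return sum(losses) / len(losses)


def combined_loss(inputs, reconstructions, embeddings, labels,
                  alpha=0.5, margin=1.0):
    """Weighted sum of both terms; `alpha` is a hypothetical trade-off weight."""
    recon = reconstruction_loss(inputs, reconstructions)
    contrast = contrastive_term(embeddings, labels, margin)
    return (1.0 - alpha) * recon + alpha * contrast
```

In this sketch, raising `alpha` favors class separation in the embedding space, while lowering it favors faithful reconstruction, which mirrors the balance the abstract describes.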
Oct-29-2024
- Country:
  - Asia
    - China > Guangdong Province
      - Guangzhou (0.04)
    - India > Gujarat (0.04)
    - Middle East > Jordan (0.04)
    - South Korea > Seoul
      - Seoul (0.04)
    - Vietnam > Hanoi
      - Hanoi (0.04)
  - Europe
    - Croatia > Dubrovnik-Neretva County
      - Dubrovnik (0.04)
    - France > Grand Est
      - Bas-Rhin > Strasbourg (0.04)
    - Germany > Berlin (0.04)
    - Italy > Tuscany
      - Pisa Province > Pisa (0.04)
    - Monaco (0.04)
  - North America
    - Canada > Ontario
      - Toronto (0.04)
    - United States
      - California
        - San Mateo County > San Mateo (0.04)
        - Santa Clara County > Mountain View (0.04)
      - Indiana > St. Joseph County
        - Notre Dame (0.04)
      - New York > New York County
        - New York City (0.04)
- Genre:
  - Research Report (1.00)
- Technology: