FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification
Xia, Tingyu; Wang, Yue; Tian, Yuan; Chang, Yi
arXiv.org Artificial Intelligence
Weakly-supervised text classification aims to train a classifier using only class descriptions and unlabeled data. Recent research shows that keyword-driven methods can achieve state-of-the-art performance on various tasks. However, these methods not only rely on carefully crafted class descriptions to obtain class-specific keywords but also require a substantial amount of unlabeled data and take a long time to train. This paper proposes FastClass, an efficient weakly-supervised classification approach. It uses dense text representations to retrieve class-relevant documents from an external unlabeled corpus and selects an optimal subset to train a classifier. Compared to keyword-driven methods, our approach is less reliant on initial class descriptions, as it no longer needs to expand each class description into a set of class-specific keywords. Experiments on a wide range of classification tasks show that the proposed approach frequently outperforms keyword-driven models in classification accuracy and often trains orders of magnitude faster.
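The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for a real dense encoder, the fixed `top_k` cutoff is an assumed placeholder for the paper's subset selection, and all function names and example texts are hypothetical.

```python
import math
from collections import Counter


def embed(text, vocab):
    """Toy stand-in for a dense encoder: an L2-normalised bag-of-words vector."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity of two vectors that are already L2-normalised."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(class_descriptions, corpus, top_k):
    """Return (document, label) pairs: the top_k corpus documents most similar
    to each class description. top_k is an assumed, fixed cutoff."""
    texts = list(class_descriptions.values()) + list(corpus)
    vocab = sorted({w for t in texts for w in t.lower().split()})
    doc_vecs = [embed(d, vocab) for d in corpus]
    pseudo_labeled = []
    for label, description in class_descriptions.items():
        query = embed(description, vocab)
        ranked = sorted(
            range(len(corpus)),
            key=lambda i: cosine(query, doc_vecs[i]),
            reverse=True,
        )
        pseudo_labeled.extend((corpus[i], label) for i in ranked[:top_k])
    return pseudo_labeled


if __name__ == "__main__":
    # Hypothetical class descriptions and unlabeled corpus.
    classes = {"sports": "football match", "business": "stock market"}
    corpus = [
        "the team won the football match",
        "the stock market fell sharply",
        "the striker scored in the football final",
        "investors sold stock after the market report",
    ]
    for doc, label in retrieve(classes, corpus, top_k=2):
        print(label, "<-", doc)
```

The resulting pseudo-labeled pairs could then be passed to any standard supervised text classifier. How the subset is chosen and which classifier is trained are not specified in the abstract, so both are left out of this sketch.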
Dec-14-2022
- Country:
  - North America > United States
    - North Carolina (0.04)
  - Asia > China
    - Jilin Province (0.04)
- Genre:
  - Research Report > New Finding (0.48)
- Industry:
  - Education (0.67)
- Technology: