LIME: Weakly-Supervised Text Classification Without Seeds
–arXiv.org Artificial Intelligence
In weakly-supervised text classification, only label names act as sources of supervision. Predominant approaches to weakly-supervised text classification utilize a two-phase framework, where test samples are first assigned pseudo-labels and are then used to train a neural text classifier. In most previous work, the pseudo-labeling step is dependent on obtaining seed words that best capture the relevance of each class label. We present LIME, a framework for weakly-supervised text classification that entirely replaces the brittle seed-word generation process with entailment-based pseudo-classification. We find that combining weakly-supervised classification and textual entailment mitigates shortcomings of both, resulting in a more streamlined and effective classification pipeline. With just an off-the-shelf textual entailment model, LIME outperforms recent baselines in weakly-supervised text classification and achieves state-of-the-art in 4 benchmarks. We open source our code at https://github.com/seongminp/LIME.
arXiv.org Artificial Intelligence
Oct-13-2022
- Country:
- Europe > Netherlands (0.04)
- North America > United States
- New York > New York County
- New York City (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- California > San Francisco County
- San Francisco (0.14)
- New York > New York County
- Asia
- China > Hong Kong (0.04)
- South Korea > Seoul
- Seoul (0.04)
- Genre:
- Research Report (0.64)
- Technology: