Imbalanced Multi-label Classification for Business-related Text with Moderately Large Label Spaces
Arslan, Muhammad, Cruz, Christophe
arXiv.org Artificial Intelligence
In this study, we compare four methods for multi-label text classification on an imbalanced business dataset: fine-tuned BERT, Binary Relevance, Classifier Chains, and Label Powerset. Fine-tuned BERT outperforms the other three methods by a significant margin, achieving high accuracy, F1-score, precision, and recall. Binary Relevance also performs well on this dataset, while Classifier Chains and Label Powerset perform relatively poorly. These findings highlight the effectiveness of fine-tuned BERT for multi-label text classification and suggest that it may be a useful tool for businesses seeking to analyze complex, multifaceted texts.
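The three non-BERT methods named in the abstract are problem-transformation approaches: each one recasts the multi-label targets so that ordinary single-label classifiers can be trained on them. The sketch below is illustrative only and is not the authors' code. The label names, the toy data, and the helper functions (`binary_relevance_targets`, `label_powerset_targets`, `classifier_chain_inputs`) are all assumptions introduced here. It shows only how the training targets are reshaped, not how any classifier is fitted.

```python
# Toy multi-label targets: one row per document, one column per label.
# Label names and values are hypothetical, not taken from the paper's dataset.
LABELS = ["finance", "legal", "hr"]
Y = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
]
# Placeholder one-dimensional features, one row per document (also hypothetical).
X = [[0.1], [0.2], [0.3], [0.4], [0.5]]


def binary_relevance_targets(Y):
    """Binary Relevance: one independent binary target vector per label."""
    return [[row[j] for row in Y] for j in range(len(Y[0]))]


def label_powerset_targets(Y):
    """Label Powerset: each distinct label combination becomes one class id."""
    class_ids = {}
    targets = []
    for row in Y:
        key = tuple(row)
        if key not in class_ids:
            class_ids[key] = len(class_ids)
        targets.append(class_ids[key])
    return targets, class_ids


def classifier_chain_inputs(X, Y, order):
    """Classifier Chains: the training features for each label in `order`
    are extended with the true values of the labels earlier in the chain."""
    chain = []
    for pos, j in enumerate(order):
        earlier = order[:pos]
        features = [x + [y[k] for k in earlier] for x, y in zip(X, Y)]
        target = [y[j] for y in Y]
        chain.append((features, target))
    return chain


br_targets = binary_relevance_targets(Y)
lp_targets, lp_classes = label_powerset_targets(Y)
cc_inputs = classifier_chain_inputs(X, Y, order=[0, 1, 2])
```

On this toy data, Binary Relevance yields three separate binary problems, Label Powerset yields a single three-class problem (one class per observed label combination), and Classifier Chains yields three binary problems whose feature vectors grow by one entry at each step of the chain. With a larger label space, Label Powerset can produce many classes that have very few examples each, which is one commonly cited difficulty on imbalanced data. The abstract itself does not state why that method performed poorly.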
Jun-12-2023