Methods for addressing class imbalance in deep learning-based natural language processing


Figure 1: Modern Transformer-based Natural Language Processing (NLP) methods still struggle with class imbalance: class-wise performance (second row, each dot represents one class) decreases as class frequency in the training data (first row) decreases, across a variety of NLP tasks.

Natural Language Processing (NLP) tasks are often addressed by training supervised models on manually labeled datasets. This comes with the challenge that categories rarely occur with the same frequency; in practice, the distribution of samples across classes is usually highly skewed. In sentiment analysis, for example, there may be a large number of negative reviews but only a small number of positive ones. Such class imbalance in the training and evaluation datasets poses a challenge for NLP models, which are more heavily influenced by majority-class data during training.
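
One common way to counteract this effect is to reweight the training loss by inverse class frequency, so that errors on rare classes contribute more to the gradient than errors on frequent ones. The sketch below is only an illustration of that general idea, not a method taken from this work; the label counts and the binary sentiment setup are hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical label counts from a skewed training set
# (e.g., many negative reviews, few positive ones).
label_counts = torch.tensor([9500.0, 500.0])  # [negative, positive]

# Inverse-frequency class weights: rarer classes receive larger weights,
# so the loss is not dominated by the majority class.
weights = label_counts.sum() / (len(label_counts) * label_counts)

# Weighted cross-entropy loss for a two-class classifier.
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(4, 2)            # model outputs for a mini-batch
targets = torch.tensor([0, 0, 0, 1])  # mostly majority-class labels
loss = criterion(logits, targets)     # minority-class error weighted up
```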
