Multi-label Emotion Classification with PyTorch + HuggingFace's Transformers and W&B for Tracking
The GoEmotions dataset contains 58k carefully curated Reddit comments labeled for 27 emotion categories or Neutral. The release includes the raw data as well as a smaller, simplified version of the dataset with predefined train/val/test splits. After going through a few examples in the dataset's visualizer, I realized how valuable this dataset is, because it's rare to find emotion classification datasets that go beyond 5–6 emotions. Here, 27 emotions are assigned, including rare and closely related ones such as disappointment, disapproval, grief, remorse, and sadness. Telling such similar emotions apart is often difficult with typical datasets. This made it clear to me that this is an excellent dataset that can be used in many applications involving text analysis.
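Because a single comment can carry more than one emotion, each example is usually represented as a fixed-length 0/1 target vector rather than a single class id. The snippet below is a minimal sketch of that encoding, not the article's own code: the helper name `to_multi_hot` and the label indices 3 and 17 are hypothetical, and only the count of 28 classes (27 emotions plus Neutral) comes from the description above.

```python
# 27 emotion categories plus Neutral, as stated in the dataset description.
NUM_LABELS = 28


def to_multi_hot(label_ids, num_labels=NUM_LABELS):
    """Turn a list of label indices into a fixed-length 0/1 vector.

    Hypothetical helper for illustration; the indices passed in are
    assumed to be integers in the range [0, num_labels).
    """
    vec = [0.0] * num_labels
    for idx in label_ids:
        if not 0 <= idx < num_labels:
            raise ValueError(f"label index {idx} is outside 0..{num_labels - 1}")
        vec[idx] = 1.0
    return vec


# A comment tagged with two (hypothetical) label indices at once.
example = to_multi_hot([3, 17])
print(len(example), sum(example))
```

A vector like this could then serve as the target for a per-label binary loss such as `torch.nn.BCEWithLogitsLoss`, which treats every emotion as an independent yes/no decision.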
Aug-19-2021, 13:05:48 GMT