Self-Relation Attention and Temporal Awareness for Emotion Recognition via Vocal Burst
Trinh, Dang-Linh, Vo, Minh-Cong, Lee, Guee-Sang
arXiv.org Artificial Intelligence
This technical report presents our emotion recognition pipeline for the high-dimensional emotion task (A-VB High) of the ACII Affective Vocal Bursts (A-VB) 2022 Workshop & Competition. Our proposed method consists of three stages. First, we extract latent features from the raw audio signal and from its Mel-spectrogram using self-supervised learning methods. Then, the features from the raw signal are fed to the self-relation attention and temporal awareness (SA-TA) module, which learns the relationships among these latent features. Finally, we concatenate all the features and use a fully-connected layer to predict the score of each emotion. In our experiments, the proposed method achieves a mean concordance correlation coefficient (CCC) of 0.7295 on the test set, compared to 0.5686 for the baseline model. The code of our method is available at https://github.com/linhtd812/A-VB2022.
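The reported metric can be illustrated with a minimal sketch of the standard concordance correlation coefficient, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), computed with population statistics. The function names `concordance_cc` and `mean_ccc` are hypothetical and are not taken from the authors' repository; the sketch also assumes that "mean CCC" denotes the per-emotion CCC averaged over all emotions.

```python
from statistics import fmean


def concordance_cc(y_true, y_pred):
    """Concordance correlation coefficient between two equal-length sequences.

    Uses population (divide-by-N) variance and covariance. Raises
    ZeroDivisionError if both sequences are constant with equal means.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = fmean(y_true)
    mean_p = fmean(y_pred)
    var_t = fmean([(t - mean_t) ** 2 for t in y_true])
    var_p = fmean([(p - mean_p) ** 2 for p in y_pred])
    cov = fmean([(t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)])
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


def mean_ccc(true_by_emotion, pred_by_emotion):
    """Average the CCC over emotions (assumed meaning of 'mean CCC').

    Each argument is a list with one score sequence per emotion.
    """
    scores = [
        concordance_cc(t, p) for t, p in zip(true_by_emotion, pred_by_emotion)
    ]
    return fmean(scores)
```

Unlike Pearson correlation, CCC penalizes a constant offset between predictions and labels: predictions that track the labels perfectly but are shifted by a fixed amount score below 1.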
Sep-26-2022
- Country:
  - South America > Venezuela (0.04)
  - North America > United States (0.04)
  - Asia > China (0.04)
  - Africa > South Africa (0.04)
  - Europe > United Kingdom
    - England > Oxfordshire > Oxford (0.04)
- Genre:
  - Research Report (0.64)
- Technology: