Dynamic Difficulty Awareness Training for Continuous Emotion Prediction
Zhang, Zixing, Han, Jing, Coutinho, Eduardo, Schuller, Björn
arXiv.org Artificial Intelligence
Abstract: Time-continuous emotion prediction has become an increasingly compelling task in machine learning. Considerable efforts have been made to advance the performance of these systems. Nonetheless, the main focus has been the development of more sophisticated models and the incorporation of different expressive modalities (e.g., speech, face, and physiology). In this paper, motivated by the benefit of difficulty awareness in human learning, we propose a novel machine learning framework, Dynamic Difficulty Awareness Training (DDAT), which sheds fresh light on this research by directly exploiting the difficulties in learning to boost the machine learning process. The DDAT framework consists of two stages: information retrieval and information exploitation. In the first stage, we use the reconstruction error of the input features or the annotation uncertainty to estimate the difficulty of learning specific information. The obtained difficulty level is then used in tandem with the original features to update the model input in a second learning stage, with the expectation that the model can learn to focus on high-difficulty regions of the learning process. We perform extensive experiments on a benchmark database (RECOLA) to evaluate the effectiveness of the proposed framework. The experimental results show that our approach outperforms related baselines as well as other well-established time-continuous emotion prediction systems, which suggests that dynamically integrating the difficulty information for neural networks can help enhance the learning process.
Time-continuous emotion prediction systems have received widespread interest in the machine learning (ML) community over the past decade [1]-[3]. One of the main reasons for this interest is that time-continuous emotion prediction can analyse subtle and complex affective states of humans over time, and plays a central role in smart conversational agents that aim to achieve natural and intuitive interaction between humans and machines [2], [4]-[7]. Great efforts have been made in this field, and most of them can generally be classified into two strands.
Z. Zhang is with GLAM (the Group on Language, Audio & Music), Imperial College London, UK.
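The two DDAT stages described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract refers to neural networks, whereas here the reconstruction is a deliberately trivial stand-in (the per-feature mean over the sequence). Measuring annotation uncertainty as the standard deviation across annotators, and combining difficulty with the features by appending it as one extra input dimension, are likewise assumptions. All function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence

Frame = List[float]


def reconstruction_difficulty(frames: Sequence[Frame]) -> List[float]:
    """Stage 1 (option A): per-frame mean squared reconstruction error.

    Assumption: the reconstruction of every frame is the per-feature mean
    over the sequence, used in place of a trained reconstruction model.
    """
    n_features = len(frames[0])
    centre = [mean(frame[j] for frame in frames) for j in range(n_features)]
    return [
        mean((x - c) ** 2 for x, c in zip(frame, centre))
        for frame in frames
    ]


def annotation_difficulty(ratings: Sequence[Sequence[float]]) -> List[float]:
    """Stage 1 (option B): per-frame disagreement between annotators.

    Assumption: uncertainty is the population standard deviation of the
    ratings that the annotators gave to the same frame.
    """
    return [pstdev(frame_ratings) for frame_ratings in ratings]


def augment(frames: Sequence[Frame], difficulty: Sequence[float]) -> List[Frame]:
    """Stage 2 input: each frame's features followed by its difficulty value."""
    if len(frames) != len(difficulty):
        raise ValueError("need exactly one difficulty value per frame")
    return [list(frame) + [d] for frame, d in zip(frames, difficulty)]


# Toy sequence: three frames with two features each.
frames = [[0.0, 1.0], [0.0, 1.0], [3.0, 4.0]]
difficulty = reconstruction_difficulty(frames)
augmented = augment(frames, difficulty)
```

In this toy sequence the third frame lies furthest from the stand-in reconstruction, so it receives the largest difficulty value. A second-stage model would then be trained on `augmented` rather than on `frames`.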
Sep-26-2018
- Country:
  - Asia (1.00)
  - Europe (1.00)
  - North America > United States
    - California > San Francisco County > San Francisco (0.14)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Education > Educational Setting
    - Higher Education (0.48)
  - Health & Medicine (0.88)
- Technology: