Affect Analysis in-the-wild: Valence-Arousal, Expressions, Action Units and a Unified Framework

Kollias, Dimitrios; Zafeiriou, Stefanos

arXiv.org Artificial Intelligence 

Affect recognition based on subjects' facial expressions has been a topic of major research in the attempt to build machines that can understand the way subjects feel, act and react. In the past, due to the unavailability of large amounts of data captured in real-life situations, research has mainly focused on controlled environments. Recently, however, social media and online platforms have come into wide use, and deep learning has emerged as a means to solve visual analysis and recognition problems. This paper exploits these advances and presents significant contributions for affect analysis and recognition in-the-wild. Affect analysis and recognition can be seen as a dual knowledge generation problem, involving: i) creation of new, large and rich in-the-wild databases and ii) design and training of novel deep neural architectures that are able to analyse affect over these databases and to successfully generalise their performance to other datasets. The paper focuses on large in-the-wild databases, i.e., Aff-Wild and Aff-Wild2, and presents the design of two classes of deep neural networks trained with these databases. The first class refers to uni-task affect recognition, focusing on prediction of the valence and arousal dimensional variables. The second class refers to estimation of all main behavior tasks, i.e., valence-arousal prediction, categorical emotion classification in seven basic facial expressions, and facial Action Unit detection. A novel multi-task and holistic framework is presented which is able to jointly learn, effectively generalize, and perform affect recognition over all existing in-the-wild databases. Extensive experimental studies illustrate the achieved performance improvement over the existing state-of-the-art in affect recognition.

This paper presents recent developments and research directions in affective behavior analysis in-the-wild, which is a major targeted characteristic of human-computer interaction systems in real-life applications.
Such systems, machines and robots should be able to automatically sense and interpret facial and audio-visual signals relevant to emotions, appraisals and intentions, and thus be able to interact in a 'human-centered' and engaging manner with people, as their digital assistants in the home, work, operational or industrial environment. Through human affect recognition, the reactions of the machine, or robot, will be consistent with people's expectations and emotions, and their verbal and non-verbal interactions will be positively received by humans. Moreover, this interaction should not depend on the respective context, nor on the human's age, sex, ethnicity, educational level, profession, or social position. As a consequence, the development of intelligent systems able to analyze human behavior in-the-wild can contribute to the generation of trust, understanding and closeness between humans and machines in real-life environments.
