In this paper, we propose an approach for Facial Expressions Recognition (FER) based on a deep multi-facial patches aggregation network. Deep features are learned from facial patches using deep sub-networks and aggregated within one deep architecture for expression classification . Several problems may affect the performance of deep-learning based FER approaches, in particular, the small size of existing FER datasets which might not be sufficient to train large deep learning networks. Moreover, it is extremely time-consuming to collect and annotate a large number of facial images. To account for this, we propose two data augmentation techniques for facial expression generation to expand FER labeled training datasets. We evaluate the proposed framework on three FER datasets. Results show that the proposed approach achieves state-of-art FER deep learning approaches performance when the model is trained and tested on images from the same dataset. Moreover, the proposed data augmentation techniques improve the expression recognition rate, and thus can be a solution for training deep learning FER models using small datasets. The accuracy degrades significantly when testing for dataset bias.
Facial expression analysis based on machine learning requires large number of well-annotated data to reflect different changes in facial motion. Publicly available datasets truly help to accelerate research in this area by providing a benchmark resource, but all of these datasets, to the best of our knowledge, are limited to rough annotations for action units, including only their absence, presence, or a five-level intensity according to the Facial Action Coding System. To meet the need for videos labeled in great detail, we present a well-annotated dataset named FEAFA for Facial Expression Analysis and 3D Facial Animation. One hundred and twenty-two participants, including children, young adults and elderly people, were recorded in real-world conditions. In addition, 99,356 frames were manually labeled using Expression Quantitative Tool developed by us to quantify 9 symmetrical FACS action units, 10 asymmetrical (unilateral) FACS action units, 2 symmetrical FACS action descriptors and 2 asymmetrical FACS action descriptors, and each action unit or action descriptor is well-annotated with a floating point number between 0 and 1. To provide a baseline for use in future research, a benchmark for the regression of action unit values based on Convolutional Neural Networks are presented. We also demonstrate the potential of our FEAFA dataset for 3D facial animation. Almost all state-of-the-art algorithms for facial animation are achieved based on 3D face reconstruction. We hence propose a novel method that drives virtual characters only based on action unit value regression of the 2D video frames of source actors.
Emotional Intelligence in Human-Computer Interaction has attracted increasing attention from researchers in multidisciplinary research fields including psychology, computer vision, neuroscience, artificial intelligence, and related disciplines. Human prone to naturally interact with computers face-to-face. Human Expressions is an important key to better link human and computers. Thus, designing interfaces able to understand human expressions and emotions can improve Human-Computer Interaction (HCI) for better communication. In this paper, we investigate HCI via a deep multi-facial patches aggregation network for Face Expression Recognition (FER). Deep features are extracted from facial parts and aggregated for expression classification. Several problems may affect the performance of the proposed framework like the small size of FER datasets and the high number of parameters to learn. For That, two data augmentation techniques are proposed for facial expression generation to expand the labeled training. The proposed framework is evaluated on the extended Cohn-Konade dataset (CK+) and promising results are achieved.
Facial pain expression is an important modality for assessing pain, especially when the patient's verbal ability to communicate is impaired. The facial muscle-based action units (AUs), which are defined by the Facial Action Coding System (FACS), have been widely studied and are highly reliable as a method for detecting facial expressions (FE) including valid detection of pain. Unfortunately, FACS coding by humans is a very time-consuming task that makes its clinical use prohibitive. Significant progress on automated facial expression recognition (AFER) has led to its numerous successful applications in FACS-based affective computing problems. However, only a handful of studies have been reported on automated pain detection (APD), and its application in clinical settings is still far from a reality. In this paper, we review the progress in research that has contributed to automated pain detection, with focus on 1) the framework-level similarity between spontaneous AFER and APD problems; 2) the evolution of system design including the recent development of deep learning methods; 3) the strategies and considerations in developing a FACS-based pain detection framework from existing research; and 4) introduction of the most relevant databases that are available for AFER and APD studies. We attempt to present key considerations in extending a general AFER framework to an APD framework in clinical settings. In addition, the performance metrics are also highlighted in evaluating an AFER or an APD system.
Measurement of facial expressions is important for research and assessment psychiatry, neurology,and experimental psychology (Ekman, Huang, Sejnowski, & Hager, 1992), and has technological applications in consumer-friendly user interfaces, interactive videoand entertainment rating. The Facial Action Coding System (FACS) is a method for measuring facial expressions in terms of activity in the underlying facial muscles (Ekman & Friesen, 1978). We are exploring ways to automate FACS.