Heterogeneous face recognition (HFR) refers to matching a probe face image taken from one modality against face images acquired from another modality. It plays an important role in security scenarios. However, HFR remains a challenging problem because of the large discrepancies between cross-modality images. This paper proposes an asymmetric joint learning (AJL) approach to handle this issue. The proposed method transforms the cross-modality differences mutually by incorporating synthesized images into the learning process, which provides more discriminative information. Although the aggregated data augments the scale of the intra-class samples, it also reduces the inter-class diversity (i.e., discriminative information). We therefore develop the AJL model to balance this dilemma. Finally, we obtain the similarity score between two heterogeneous face images through the log-likelihood ratio. Extensive experiments on a viewed sketch database, a forensic sketch database, and a near-infrared image database illustrate that the proposed AJL-HFR method achieves superior performance in comparison to state-of-the-art methods.
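The abstract above does not spell out how its log-likelihood ratio is computed, so the following is only a minimal sketch of how such a score can be formed under an assumed Gaussian model. It assumes each feature vector is the sum of a zero-mean identity component with covariance `s_mu` and a zero-mean within-identity variation with covariance `s_eps`; these names, the model, and the function names are illustrative assumptions, not the AJL model itself.

```python
import numpy as np


def gaussian_log_density(x, cov):
    """Log density of a zero-mean Gaussian, omitting the constant 2*pi term."""
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (quad + logdet)


def log_likelihood_ratio(x1, x2, s_mu, s_eps):
    """Score a pair: log P(x1, x2 | same identity) - log P(x1, x2 | different).

    Assumed model (hypothetical, for illustration): x = mu + eps with
    mu ~ N(0, s_mu) shared by images of one identity and eps ~ N(0, s_eps).
    """
    d = x1.shape[0]
    pair = np.concatenate([x1, x2])
    total = s_mu + s_eps
    zeros = np.zeros((d, d))
    # Same identity: the two images share mu, so they are correlated via s_mu.
    cov_same = np.block([[total, s_mu], [s_mu, total]])
    # Different identities: the two images are independent.
    cov_diff = np.block([[total, zeros], [zeros, total]])
    # The omitted 2*pi constants cancel because both densities have dimension 2d.
    return gaussian_log_density(pair, cov_same) - gaussian_log_density(pair, cov_diff)
```

Under this assumed model, a positive score favors the same-identity hypothesis and a negative score favors the different-identity hypothesis; in practice the covariances would be estimated from training data.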
In this work, we propose several enhancements to improve the performance of any facial emotion recognition (FER) system. We believe that changes in the positions of the fiducial points and in the intensities capture the crucial information about the emotion in a face image. We propose feeding the gradient and the Laplacian of the input image, together with the original input, into a convolutional neural network (CNN). These modifications help the network learn additional information from the gradient and Laplacian of the images, information that a plain CNN is not able to extract from the raw images. We performed a number of experiments on two well-known datasets, KDEF and FERplus. Our approach enhances the already high performance of state-of-the-art FER systems by 3 to 5%.
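As a rough illustration of the input construction described in the abstract above, the sketch below stacks a grayscale image with its gradient and Laplacian as three channels. The use of the gradient magnitude (rather than separate x and y components), the 5-point Laplacian stencil, the edge padding, and all function names are assumptions made for this example; the abstract does not specify them.

```python
import numpy as np


def gradient_magnitude(img):
    """Per-pixel gradient magnitude from finite differences (an assumed choice)."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)


def laplacian(img):
    """5-point discrete Laplacian with replicated borders (an assumed stencil)."""
    img = img.astype(np.float64)
    padded = np.pad(img, 1, mode="edge")
    return (
        padded[:-2, 1:-1]    # pixel above
        + padded[2:, 1:-1]   # pixel below
        + padded[1:-1, :-2]  # pixel to the left
        + padded[1:-1, 2:]   # pixel to the right
        - 4.0 * img
    )


def stack_input_channels(img):
    """Return a (3, H, W) array: original image, gradient magnitude, Laplacian."""
    return np.stack(
        [img.astype(np.float64), gradient_magnitude(img), laplacian(img)],
        axis=0,
    )
```

The resulting array could be passed to a CNN whose first layer accepts three input channels; whether the original work stacks the maps this way or combines them differently is not stated in the abstract.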
Ensuring the security of transactions is currently one of the major challenges facing banking systems. The use of the face for biometric authentication of users is being adopted worldwide due to its convenience and acceptability, and also because almost all computers and mobile devices now have built-in cameras. This user authentication approach is attracting large investments from banking and financial institutions, especially in cross-domain scenarios, in which facial images from ID documents are compared with digital self-portraits (selfies) taken with the cameras of mobile devices, for the automated opening of new checking accounts or the authorization of financial transactions. In this work, besides collecting a large cross-domain face database, with 27,002 real facial images of selfies and ID documents (13,501 subjects) captured from the systems of the major public Brazilian bank, we propose a novel approach for such cross-domain face matching based on deep features extracted by two well-referenced Convolutional Neural Networks (CNNs). Results obtained on the collected dataset, which we call FaceBank, with accuracy rates higher than 93%, demonstrate the robustness of the proposed approach to the cross-domain problem (comparing faces in IDs and selfies) and its feasibility for application in real banking security systems.
The quality and size of the training set have a great impact on the results of deep learning-based face-related tasks. However, collecting and labeling adequate samples with high quality and balanced distributions remains laborious and expensive, and various data augmentation techniques have thus been widely used to enrich the training dataset. In this paper, we systematically review existing work on face data augmentation from the perspectives of transformation types and methods, including the state-of-the-art approaches. Among all these approaches, we put the emphasis on deep learning-based works, especially generative adversarial networks, which have been recognized as more powerful and effective tools in recent years. We present their principles, discuss the results, and show their applications as well as limitations. Evaluation metrics for these approaches are also introduced. We point out the challenges and opportunities in the field of face data augmentation, and provide brief yet insightful discussions.
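The abstract above does not list specific transformations, so the following is only an illustrative sketch of how a training set can be enriched with two generic ones, a horizontal flip and a brightness scaling. The choice of transformations, the 0.8 to 1.2 brightness range, the 0 to 255 pixel range, and all function names are assumptions for this example.

```python
import numpy as np


def horizontal_flip(img):
    """Mirror the image left to right."""
    return img[:, ::-1]


def scale_brightness(img, factor):
    """Multiply intensities by `factor`, clipped to an assumed 0..255 range."""
    return np.clip(img.astype(np.float64) * factor, 0.0, 255.0)


def augment(img, rng, n_copies=4):
    """Return `n_copies` randomly transformed variants of one face image."""
    variants = []
    for _ in range(n_copies):
        sample = img.astype(np.float64)
        # Flip with probability 0.5 (an assumed setting).
        if rng.random() < 0.5:
            sample = horizontal_flip(sample)
        # Random brightness factor in an assumed range.
        sample = scale_brightness(sample, rng.uniform(0.8, 1.2))
        variants.append(sample)
    return variants
```

A call such as `augment(face, np.random.default_rng(0))` yields four variants that keep the label of the original image. The generative approaches that the survey emphasizes are far more involved and are not represented by this sketch.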
Emotional intelligence in human-computer interaction (HCI) has attracted increasing attention from researchers in multidisciplinary research fields including psychology, computer vision, neuroscience, artificial intelligence, and related disciplines. Humans are prone to interact naturally with computers face-to-face. Human expressions are an important key to better linking humans and computers. Thus, designing interfaces able to understand human expressions and emotions can improve HCI for better communication. In this paper, we investigate HCI via a deep multi-facial patches aggregation network for Facial Expression Recognition (FER). Deep features are extracted from facial parts and aggregated for expression classification. Several problems may affect the performance of the proposed framework, such as the small size of FER datasets and the high number of parameters to learn. To address them, two data augmentation techniques are proposed for facial expression generation to expand the labeled training set. The proposed framework is evaluated on the extended Cohn-Kanade dataset (CK+) and promising results are achieved.
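The idea of extracting features from facial parts and aggregating them, as described in the abstract above, can be sketched roughly as follows. The patch boxes, the 48x48 aligned face size, the aggregation by concatenation, and the simple statistics standing in for the per-patch deep sub-networks are all hypothetical choices for illustration; the abstract does not give these details.

```python
import numpy as np

# Hypothetical patch boxes (top, left, height, width) on a 48x48 aligned face.
PATCH_BOXES = {
    "eyes": (8, 4, 16, 40),
    "nose": (18, 14, 16, 20),
    "mouth": (30, 10, 14, 28),
}


def extract_patches(face):
    """Crop each named facial region from the face image."""
    return {
        name: face[top:top + height, left:left + width]
        for name, (top, left, height, width) in PATCH_BOXES.items()
    }


def patch_features(patch):
    """Stand-in for a per-patch deep sub-network: mean and standard deviation."""
    return np.array([patch.mean(), patch.std()])


def aggregate(face):
    """Concatenate per-patch features into one vector for a classifier."""
    patches = extract_patches(face)
    return np.concatenate([patch_features(patches[name]) for name in PATCH_BOXES])
```

In a real system each patch would go through its own learned feature extractor, and the aggregated vector would feed an expression classifier; concatenation is only one possible aggregation rule.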