Boumaraf, Said
Reconstructing Deep Neural Networks: Unleashing the Optimization Potential of Natural Gradient Descent
Liu, Weihua, Boumaraf, Said, Li, Jianwu, Lin, Chaochao, Liu, Xiabi, Niu, Lijuan, Werghi, Naoufel
Natural gradient descent (NGD) is a powerful optimization technique for machine learning, but the computational complexity of the inverse Fisher information matrix limits its application in training deep neural networks. To overcome this challenge, we propose a novel optimization method for training deep neural networks called structured natural gradient descent (SNGD). Theoretically, we demonstrate that optimizing the original network using NGD is equivalent to using fast gradient descent (GD) to optimize the reconstructed network with a structural transformation of the parameter matrix. Thereby, we decompose the calculation of the global Fisher information matrix into the efficient computation of local Fisher matrices via constructing local Fisher layers in the reconstructed network to speed up the training. Experimental results on various deep networks and datasets demonstrate that SNGD achieves faster convergence speed than NGD while retaining comparable solutions. Furthermore, our method outperforms traditional GDs in terms of efficiency and effectiveness. Thus, our proposed method has the potential to significantly improve the scalability and efficiency of NGD in deep learning applications. Our source code is available at https://github.com/Chaochao-Lin/SNGD.
BENet: A Cross-domain Robust Network for Detecting Face Forgeries via Bias Expansion and Latent-space Attention
Liu, Weihua, Qiu, Jianhua, Boumaraf, Said, lin, Chaochao, liyuan, Pan, Li, Lin, Bennamoun, Mohammed, Werghi, Naoufel
In response to the growing threat of deepfake technology, we introduce BENet, a Cross-Domain Robust Bias Expansion Network. BENet enhances the detection of fake faces by addressing limitations in current detectors related to variations across different types of fake face generation techniques, where ``cross-domain" refers to the diverse range of these deepfakes, each considered a separate domain. BENet's core feature is a bias expansion module based on autoencoders. This module maintains genuine facial features while enhancing differences in fake reconstructions, creating a reliable bias for detecting fake faces across various deepfake domains. We also introduce a Latent-Space Attention (LSA) module to capture inconsistencies related to fake faces at different scales, ensuring robust defense against advanced deepfake techniques. The enriched LSA feature maps are multiplied with the expanded bias to create a versatile feature space optimized for subtle forgeries detection. To improve its ability to detect fake faces from unknown sources, BENet integrates a cross-domain detector module that enhances recognition accuracy by verifying the facial domain during inference. We train our network end-to-end with a novel bias expansion loss, adopted for the first time, in face forgery detection. Extensive experiments covering both intra and cross-dataset demonstrate BENet's superiority over current state-of-the-art solutions.
Occlusion-Aware Deep Convolutional Neural Network via Homogeneous Tanh-transforms for Face Parsing
Qiua, Jianhua, Liu, Weihua, Lin, Chaochao, Li, Jiaojiao, Yu, Haoping, Boumaraf, Said
Face parsing infers a pixel-wise label map for each semantic facial component. Previous methods generally work well for uncovered faces, however, they overlook facial occlusion and ignore some contextual areas outside a single face, especially when facial occlusion has become a common situation during the COVID-19 epidemic. Inspired by the lighting phenomena in everyday life, where illumination from four distinct lamps provides a more uniform distribution than a single central light source, we propose a novel homogeneous tanh-transform for image preprocessing, which is made up of four tanh-transforms. These transforms fuse the central vision and the peripheral vision together. Our proposed method addresses the dilemma of face parsing under occlusion and compresses more information from the surrounding context. Based on homogeneous tanh-transforms, we propose an occlusion-aware convolutional neural network for occluded face parsing. It combines information in both Tanh-polar space and Tanh-Cartesian space, capable of enhancing receptive fields. Furthermore, we introduce an occlusion-aware loss to focus on the boundaries of occluded regions. The network is simple, flexible, and can be trained end-to-end. To facilitate future research of occluded face parsing, we also contribute a new cleaned face parsing dataset. This dataset is manually purified from several academic or industrial datasets, including CelebAMask-HQ, Short-video Face Parsing, and the Helen dataset, and will be made public. Experiments demonstrate that our method surpasses state-of-the-art methods in face parsing under occlusion.