Mi, Yuxi
Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data
DeAndres-Tame, Ivan, Tolosana, Ruben, Melzi, Pietro, Vera-Rodriguez, Ruben, Kim, Minchul, Rathgeb, Christian, Liu, Xiaoming, Gomez, Luis F., Morales, Aythami, Fierrez, Julian, Ortega-Garcia, Javier, Zhong, Zhizhou, Huang, Yuge, Mi, Yuxi, Ding, Shouhong, Zhou, Shuigeng, He, Shuai, Fu, Lingzhi, Cong, Heng, Zhang, Rongyu, Xiao, Zhihong, Smirnov, Evgeny, Pimenov, Anton, Grigorev, Aleksei, Timoshenko, Denis, Asfaw, Kaleb Mesfin, Low, Cheng Yaw, Liu, Hao, Wang, Chuyi, Zuo, Qing, He, Zhixiang, Shahreza, Hatef Otroshi, George, Anjith, Unnervik, Alexander, Rahimi, Parsa, Marcel, Sébastien, Neto, Pedro C., Huber, Marco, Kolf, Jan Niklas, Damer, Naser, Boutros, Fadi, Cardoso, Jaime S., Sequeira, Ana F., Atzori, Andrea, Fenu, Gianni, Marras, Mirko, Štruc, Vitomir, Yu, Jiang, Li, Zhangjie, Li, Jichun, Zhao, Weisong, Lei, Zhen, Zhu, Xiangyu, Zhang, Xiao-Yu, Biesseck, Bernardo, Vidal, Pedro, Coelho, Luiz, Granada, Roger, Menotti, David
Synthetic data is gaining increasing popularity for face recognition technologies, mainly due to the privacy concerns and challenges associated with obtaining real data, including diverse scenarios, quality, and demographic groups, among others. It also offers some advantages over real data, such as the large amount of data that can be generated or the ability to customize it to adapt to specific problem-solving needs. To effectively use such data, face recognition models should also be specifically designed to exploit synthetic data to its fullest potential. In order to promote the proposal of novel Generative AI methods and synthetic data, and investigate the application of synthetic data to better train face recognition systems, we introduce the 2nd FRCSyn-onGoing challenge, based on the 2nd Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), originally launched at CVPR 2024. This is an ongoing challenge that provides researchers with an accessible platform to benchmark i) the proposal of novel Generative AI methods and synthetic data, and ii) novel face recognition systems that are specifically proposed to take advantage of synthetic data. We focus on exploring the use of synthetic data both individually and in combination with real data to solve current challenges in face recognition such as demographic bias, domain adaptation, and performance constraints in demanding situations, such as age disparities between training and testing, changes in the pose, or occlusions. Very interesting findings are obtained in this second edition, including a direct comparison with the first one, in which synthetic databases were restricted to DCFace and GANDiffFace.
Second Edition FRCSyn Challenge at CVPR 2024: Face Recognition Challenge in the Era of Synthetic Data
DeAndres-Tame, Ivan, Tolosana, Ruben, Melzi, Pietro, Vera-Rodriguez, Ruben, Kim, Minchul, Rathgeb, Christian, Liu, Xiaoming, Morales, Aythami, Fierrez, Julian, Ortega-Garcia, Javier, Zhong, Zhizhou, Huang, Yuge, Mi, Yuxi, Ding, Shouhong, Zhou, Shuigeng, He, Shuai, Fu, Lingzhi, Cong, Heng, Zhang, Rongyu, Xiao, Zhihong, Smirnov, Evgeny, Pimenov, Anton, Grigorev, Aleksei, Timoshenko, Denis, Asfaw, Kaleb Mesfin, Low, Cheng Yaw, Liu, Hao, Wang, Chuyi, Zuo, Qing, He, Zhixiang, Shahreza, Hatef Otroshi, George, Anjith, Unnervik, Alexander, Rahimi, Parsa, Marcel, Sébastien, Neto, Pedro C., Huber, Marco, Kolf, Jan Niklas, Damer, Naser, Boutros, Fadi, Cardoso, Jaime S., Sequeira, Ana F., Atzori, Andrea, Fenu, Gianni, Marras, Mirko, Štruc, Vitomir, Yu, Jiang, Li, Zhangjie, Li, Jichun, Zhao, Weisong, Lei, Zhen, Zhu, Xiangyu, Zhang, Xiao-Yu, Biesseck, Bernardo, Vidal, Pedro, Coelho, Luiz, Granada, Roger, Menotti, David
Synthetic data is gaining increasing relevance for training machine learning models. This is mainly motivated due to several factors such as the lack of real data and intra-class variability, time and errors produced in manual labeling, and in some cases privacy concerns, among others. This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn) organized at CVPR 2024. FRCSyn aims to investigate the use of synthetic data in face recognition to address current technological limitations, including data privacy concerns, demographic biases, generalization to novel scenarios, and performance constraints in challenging situations such as aging, pose variations, and occlusions. Unlike the 1st edition, in which synthetic data from DCFace and GANDiffFace methods was only allowed to train face recognition systems, in this 2nd edition we propose new sub-tasks that allow participants to explore novel face generative methods. The outcomes of the 2nd FRCSyn Challenge, along with the proposed experimental protocol and benchmarking contribute significantly to the application of synthetic data to face recognition.
Flexible Differentially Private Vertical Federated Learning with Adaptive Feature Embeddings
Mi, Yuxi, Liu, Hongquan, Xia, Yewei, Sun, Yiheng, Guan, Jihong, Zhou, Shuigeng
The emergence of vertical federated learning (VFL) has stimulated concerns about the imperfection in privacy protection, as shared feature embeddings may reveal sensitive information under privacy attacks. This paper studies the delicate equilibrium between data privacy and task utility goals of VFL under differential privacy (DP). To address the generality issue of prior arts, this paper advocates a flexible and generic approach that decouples the two goals and addresses them successively. Specifically, we initially derive a rigorous privacy guarantee by applying norm clipping on shared feature embeddings, which is applicable across various datasets and models. Subsequently, we demonstrate that task utility can be optimized via adaptive adjustments on the scale and distribution of feature embeddings in an accuracy-appreciative way, without compromising established DP mechanisms. We concretize our observation into the proposed VFL-AFE framework, which exhibits effectiveness against privacy attacks and the capacity to retain favorable task utility, as substantiated by extensive experiments.
ARIBA: Towards Accurate and Robust Identification of Backdoor Attacks in Federated Learning
Mi, Yuxi, Guan, Jihong, Zhou, Shuigeng
The distributed nature and privacy-preserving characteristics of federated learning make it prone to the threat of poisoning attacks, especially backdoor attacks, where the adversary implants backdoors to misguide the model on certain attacker-chosen sub-tasks. In this paper, we present a novel method ARIBA to accurately and robustly identify backdoor attacks in federated learning. By empirical study, we observe that backdoor attacks are discernible by the filters of CNN layers. Based on this finding, we employ unsupervised anomaly detection to evaluate the pre-processed filters and calculate an anomaly score for each client. We then identify the most suspicious clients according to their anomaly scores. Extensive experiments are conducted, which show that our method ARIBA can effectively and robustly defend against multiple state-of-the-art attacks without degrading model performance.