Principal Eigenvalue Regularization for Improved Worst-Class Certified Robustness of Smoothed Classifiers

Jin, Gaojie, Huang, Tianjin, Mu, Ronghui, Huang, Xiaowei

Mar-21-2025–arXiv.org Artificial Intelligence

Recent studies have identified a critical challenge in deep neural networks (DNNs) known as ``robust fairness", where models exhibit significant disparities in robust accuracy across different classes. While prior work has attempted to address this issue in adversarial robustness, the study of worst-class certified robustness for smoothed classifiers remains unexplored. Our work bridges this gap by developing a PAC-Bayesian bound for the worst-class error of smoothed classifiers. Through theoretical analysis, we demonstrate that the largest eigenvalue of the smoothed confusion matrix fundamentally influences the worst-class error of smoothed classifiers. Based on this insight, we introduce a regularization method that optimizes the largest eigenvalue of smoothed confusion matrix to enhance worst-class accuracy of the smoothed classifier and further improve its worst-class certified robustness. We provide extensive experimental validation across multiple datasets and model architectures to demonstrate the effectiveness of our approach.

artificial intelligence, classifier, machine learning, (16 more...)

arXiv.org Artificial Intelligence

Mar-21-2025

arXiv.org PDF

Add feedback

Country:
- Asia (0.14)
- North America > United States (0.14)

Genre:
- Research Report (1.00)

Industry:
- Information Technology > Security & Privacy (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.48)
  - Representation & Reasoning (1.00)