Principal Eigenvalue Regularization for Improved Worst-Class Certified Robustness of Smoothed Classifiers
Jin, Gaojie, Huang, Tianjin, Mu, Ronghui, Huang, Xiaowei
arXiv.org Artificial Intelligence
Recent studies have identified a critical challenge in deep neural networks (DNNs) known as "robust fairness", where models exhibit significant disparities in robust accuracy across different classes. While prior work has attempted to address this issue in adversarial robustness, worst-class certified robustness for smoothed classifiers remains unexplored. Our work bridges this gap by developing a PAC-Bayesian bound for the worst-class error of smoothed classifiers. Through theoretical analysis, we demonstrate that the largest eigenvalue of the smoothed confusion matrix fundamentally influences the worst-class error of smoothed classifiers. Based on this insight, we introduce a regularization method that optimizes the largest eigenvalue of the smoothed confusion matrix to enhance the worst-class accuracy of the smoothed classifier and further improve its worst-class certified robustness. We provide extensive experimental validation across multiple datasets and model architectures to demonstrate the effectiveness of our approach.
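The link between the confusion matrix's largest eigenvalue and the worst-class error can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: the smoothed classifier is approximated by a majority vote over predictions on noisy copies of each input, the confusion matrix has a zero diagonal with entry (i, j) equal to the fraction of class-j examples predicted as class i, and the worst-class error is the largest column sum. All function names are hypothetical.

```python
import numpy as np

def smoothed_predictions(noisy_preds, num_classes):
    """Majority vote over noisy copies (assumed stand-in for a smoothed classifier).

    noisy_preds: int array of shape (n_examples, n_noise_samples).
    """
    counts = np.stack(
        [(noisy_preds == c).sum(axis=1) for c in range(num_classes)], axis=1
    )
    return counts.argmax(axis=1)

def confusion_matrix_offdiag(preds, labels, num_classes):
    """Assumed definition: C[i, j] = P(pred = i | label = j) for i != j, zero diagonal."""
    C = np.zeros((num_classes, num_classes))
    for j in range(num_classes):
        mask = labels == j
        if not mask.any():
            continue  # no examples of class j; leave the column at zero
        for i in range(num_classes):
            if i != j:
                C[i, j] = np.mean(preds[mask] == i)
    return C

def principal_eigenvalue(C):
    """Largest real part among the eigenvalues of C."""
    return float(np.max(np.linalg.eigvals(C).real))

def worst_class_error(C):
    """Largest per-class error, i.e. the maximum column sum of C."""
    return float(C.sum(axis=0).max())

# Toy data: 4 examples, 3 noisy predictions each, 2 classes.
labels = np.array([0, 0, 1, 1])
noisy_preds = np.array([[0, 0, 1], [1, 1, 0], [1, 1, 1], [0, 0, 1]])

preds = smoothed_predictions(noisy_preds, num_classes=2)
C = confusion_matrix_offdiag(preds, labels, num_classes=2)
print(C, principal_eigenvalue(C), worst_class_error(C))
```

In this toy case each class loses half of its examples to the other, so the principal eigenvalue and the worst-class error coincide. A training-time regularizer would need a differentiable surrogate for the confusion matrix, which this sketch does not attempt.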
Mar-21-2025
- Country:
- Asia (0.14)
- North America > United States (0.14)
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology > Security & Privacy (0.46)
- Technology: