Certifying Deep Network Risks and Individual Predictions with PAC-Bayes Loss via Localized Priors

Neural Information Processing Systems 

As machine learning increasingly relies on large, opaque foundation models powering generative and agentic AI, deploying these systems in safety-critical contexts demands rigorous guarantees that performance generalizes beyond the training data. PAC-Bayes theory provides principled certificates linking training performance to generalization risk, yet existing approaches remain impractical: simple theoretical priors yield vacuous bounds, while data-dependent priors require costly second-stage training or introduce bias. To bridge this gap, we propose a localized PAC-Bayes prior: a structured, computationally efficient prior softly concentrated around parameters favored during standard training. By integrating this localized prior directly into the standard training objective, we deliver practically tight generalization certificates with minimal workflow disruption. Under standard neural tangent kernel assumptions, our bound shrinks as networks widen and datasets grow, becoming negligible in realistic regimes. Empirically, we demonstrate tight generalization certificates on image classification (MNIST, CIFAR, ImageNet), NLP fine-tuning (GLUE), and semantic segmentation (Cityscapes), with certificates typically within three percentage points of test error at ImageNet scale. Additionally, our approach provides rigorous guarantees for individual predictions, selective rejection of uncertain predictions, adversarial robustness, and accurate calibration, directly addressing key requirements for trustworthy AI deployment.
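To illustrate why the location of the prior matters for a PAC-Bayes certificate, the following is a minimal sketch, not the paper's construction. It assumes the classical McAllester-style bound in Maurer's form, L(Q) <= L_hat(Q) + sqrt((KL(Q||P) + ln(2*sqrt(n)/delta)) / (2n)), with isotropic Gaussian prior and posterior of equal variance, and a prior center chosen independently of the sample used for the bound. All parameter values, and the function names `gaussian_kl_same_variance` and `mcallester_bound`, are hypothetical and chosen only for illustration.

```python
import math


def gaussian_kl_same_variance(mu_q, mu_p, sigma):
    """KL(Q || P) for Q = N(mu_q, sigma^2 I) and P = N(mu_p, sigma^2 I).

    With equal isotropic variances the KL reduces to the squared
    distance between the means divided by 2 * sigma^2.
    """
    sq_dist = sum((q - p) ** 2 for q, p in zip(mu_q, mu_p))
    return sq_dist / (2.0 * sigma ** 2)


def mcallester_bound(empirical_risk, kl, n, delta=0.05):
    """Classical PAC-Bayes upper bound on the expected risk of Q.

    Assumes a loss bounded in [0, 1] and n i.i.d. samples; this is the
    textbook bound, not the localized-prior bound described in the abstract.
    """
    complexity = (kl + math.log(2.0 * math.sqrt(n) / delta)) / (2.0 * n)
    return empirical_risk + math.sqrt(complexity)


# Hypothetical toy values (illustration only, not results from the paper).
trained_params = [0.9, -0.4, 1.3]   # posterior mean after training
zero_prior = [0.0, 0.0, 0.0]        # simple prior centered at the origin
local_prior = [0.8, -0.5, 1.2]      # prior centered near the trained parameters
sigma = 0.1                         # shared standard deviation
n = 50_000                          # number of training samples
train_risk = 0.02                   # empirical risk of the posterior

kl_zero = gaussian_kl_same_variance(trained_params, zero_prior, sigma)
kl_local = gaussian_kl_same_variance(trained_params, local_prior, sigma)

bound_zero = mcallester_bound(train_risk, kl_zero, n)
bound_local = mcallester_bound(train_risk, kl_local, n)

print(f"KL with origin-centered prior: {kl_zero:.2f}, bound: {bound_zero:.4f}")
print(f"KL with nearby prior:          {kl_local:.2f}, bound: {bound_local:.4f}")
```

In this toy setting the prior centered near the trained parameters gives a much smaller KL term and therefore a tighter bound. In a real network with millions of parameters the KL term dominates, which is the regime where a prior far from the learned solution makes the certificate vacuous.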
