Explainable Learning: Implicit Generative Modelling during Training for Adversarial Robustness

Priyadarshini Panda, Kaushik Roy

arXiv.org Machine Learning 

We introduce Explainable Learning (ExL), an approach for training neural networks that are intrinsically robust to adversarial attacks. We find that implicit generative modelling of random noise during posterior maximization improves a model's understanding of the data manifold, furthering adversarial robustness. We demonstrate our approach's efficacy and provide a simple visualization tool, based on Principal Component Analysis, for understanding adversarial data. Our analysis reveals that adversarial robustness, in general, manifests in models with higher variance along the high-ranked principal components. We show that models learnt with ExL perform remarkably well against a wide range of black-box attacks.
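
As a rough illustration of the quantity the PCA analysis refers to, the sketch below computes the variance captured along the highest-ranked principal components of a feature matrix. It is not the authors' tool: the function name `top_pc_variance`, the parameter `k`, and the synthetic `features` array (a stand-in for a layer's activations) are assumptions made for this example.

```python
import numpy as np

def top_pc_variance(features, k=3):
    """Variance along the k highest-ranked principal components.

    `features` is an (n_samples, n_features) array; hypothetical helper,
    not part of any published ExL code.
    """
    # Center each feature so the SVD corresponds to PCA.
    centered = features - features.mean(axis=0, keepdims=True)
    # Singular values come back sorted in descending order.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    # Convert singular values to sample variances per component.
    variances = singular_values ** 2 / (features.shape[0] - 1)
    return variances[:k]

# Synthetic stand-in for activations: 200 samples, 5 features of differing scale.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])

print(top_pc_variance(features, k=2))
```

Under these assumptions, one could compare the returned values for two models' activations on the same inputs to see which model spreads more variance over its leading components.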
