Class-Disentanglement and Applications in Adversarial Detection and Defense
–Neural Information Processing Systems
What is the minimum necessary information required by a neural net D(·) from an image x to accurately predict its class? Extracting such information in the input space from x can locate the areas that D(·) mainly attends to and shed new light on the detection of and defense against adversarial attacks. In this paper, we propose "class-disentanglement", which trains a variational autoencoder G(·) to extract this class-dependent information as x − G(x) via a trade-off between reconstructing x by G(x) and classifying x by D(x − G(x)). The former competes with the latter in decomposing x, so the latter retains in x − G(x) only the information necessary for classification.
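The trade-off described above can be illustrated with a minimal sketch. It is not the authors' implementation: the L1 reconstruction term, the weight `gamma`, the toy linear classifier standing in for D, and all function names are assumptions made for illustration, and the VAE-specific terms (such as a KL regularizer) are omitted. The sketch only shows how a reconstruction loss on G(x) and a classification loss on the residual x − G(x) combine into one objective.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear_classifier(weights, features):
    """Hypothetical stand-in for D: one row of weights per class."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def class_disentangle_loss(x, g_x, label, weights, gamma=1.0):
    """Combine reconstruction and classification losses (assumed form).

    x       : flattened input image
    g_x     : reconstruction G(x), here passed in directly
    label   : integer class index of x
    weights : parameters of the toy classifier D
    gamma   : assumed weight on the classification term
    """
    # Class-dependent residual that D is allowed to see.
    residual = [xi - gi for xi, gi in zip(x, g_x)]
    # Reconstruction term: mean absolute error between x and G(x).
    recon = sum(abs(r) for r in residual) / len(x)
    # Classification term: cross-entropy of D(x - G(x)) against the label.
    probs = softmax(linear_classifier(weights, residual))
    ce = -math.log(probs[label])
    return recon + gamma * ce, recon, ce


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # Perfect reconstruction leaves nothing in the residual for D to use.
    print(class_disentangle_loss([1.0, 2.0], [1.0, 2.0], 0, identity))
    # An empty reconstruction hands D the whole image but costs more to G.
    print(class_disentangle_loss([1.0, 2.0], [0.0, 0.0], 1, identity))
```

The two calls show the competition between the terms: when G(x) reproduces x exactly, the reconstruction loss is zero but the residual carries no class information, so the classifier is reduced to a uniform guess; when G(x) reconstructs nothing, the classifier sees everything but the reconstruction loss is large.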
Feb-9-2026, 16:37:21 GMT
- Technology: