Vechev, Martin
The Fundamental Limits of Interval Arithmetic for Neural Networks
Mirman, Matthew, Baader, Maximilian, Vechev, Martin
Interval analysis (or interval bound propagation, IBP) is a popular technique for verifying and training provably robust deep neural networks, a fundamental challenge in the area of reliable machine learning. However, despite substantial efforts, progress on addressing this key challenge has stagnated, calling into question whether interval arithmetic is a viable path forward. In this paper we present two fundamental results on the limitations of interval arithmetic for analyzing neural networks. Our main impossibility theorem states that for any neural network classifying just three points, there is a valid specification over these points that interval analysis cannot prove. Further, in the restricted case of one-hidden-layer neural networks we show a stronger impossibility result: given any radius $\alpha < 1$, there is a set of $O(\alpha^{-1})$ points with robust radius $\alpha$, separated by distance $2$, that no one-hidden-layer network can be proven to classify robustly via interval analysis.
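To make the setting concrete, the sketch below (not taken from the paper; the toy weights and input box are made up) propagates an interval box through one affine layer and a ReLU with plain interval arithmetic in NumPy, which is exactly the per-layer operation whose precision the impossibility results concern.

```python
# A minimal sketch of interval bound propagation (IBP) through an affine layer
# followed by ReLU, using NumPy. Weights below are illustrative only.
import numpy as np

def ibp_affine(lb, ub, W, b):
    """Propagate an elementwise box [lb, ub] through x -> W @ x + b."""
    W_pos, W_neg = np.clip(W, 0, None), np.clip(W, None, 0)
    new_lb = W_pos @ lb + W_neg @ ub + b
    new_ub = W_pos @ ub + W_neg @ lb + b
    return new_lb, new_ub

def ibp_relu(lb, ub):
    return np.maximum(lb, 0.0), np.maximum(ub, 0.0)

# Toy 2-2-1 network (hypothetical weights, not from the paper).
W1, b1 = np.array([[1.0, -1.0], [1.0, 1.0]]), np.zeros(2)
W2, b2 = np.array([[1.0, 1.0]]), np.zeros(1)

lb, ub = np.array([-0.1, -0.1]), np.array([0.1, 0.1])  # L_inf box around the origin
lb, ub = ibp_affine(lb, ub, W1, b1)
lb, ub = ibp_relu(lb, ub)
lb, ub = ibp_affine(lb, ub, W2, b2)
print(lb, ub)  # certified output range; looseness accumulates with depth
```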
Latent Space Smoothing for Individually Fair Representations
Peychev, Momchil, Ruoss, Anian, Balunović, Mislav, Baader, Maximilian, Vechev, Martin
Fair representation learning encodes user data to ensure fairness and utility, regardless of the downstream application. However, learning individually fair representations, i.e., guaranteeing that similar individuals are treated similarly, remains challenging in high-dimensional settings such as computer vision. In this work, we introduce LASSI, the first representation learning method for certifying individual fairness of high-dimensional data. Our key insight is to leverage recent advances in generative modeling to capture the set of similar individuals in the generative latent space. This allows learning individually fair representations where similar individuals are mapped close together, by using adversarial training to minimize the distance between their representations. Finally, we employ randomized smoothing to provably map similar individuals close together, in turn ensuring that local robustness verification of the downstream application results in end-to-end fairness certification. Our experimental evaluation on challenging real-world image data demonstrates that our method increases certified individual fairness by up to 60%, without significantly affecting task utility.
Effective Certification of Monotone Deep Equilibrium Models
Müller, Mark Niklas, Staab, Robin, Fischer, Marc, Vechev, Martin
Monotone Operator Equilibrium Models (monDEQs) represent a class of models combining the powerful deep equilibrium paradigm with convergence guarantees. Further, their inherent robustness to adversarial perturbations makes investigating their certifiability a promising research direction. Unfortunately, existing approaches are either imprecise or severely limited in scalability. In this work, we propose the first scalable and precise monDEQ verifier, based on two key ideas: (i) a novel convex relaxation enabling efficient inclusion checks, and (ii) non-trivial mathematical insights characterizing the fixpoint operations at the heart of monDEQs on sets rather than concrete inputs. An extensive evaluation of our verifier under challenging $\ell_\infty$ perturbations demonstrates that it exceeds state-of-the-art performance in terms of speed (two orders of magnitude) and scalability (an order of magnitude) while yielding 25% higher certified accuracies on the same networks.
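For readers unfamiliar with the model class, the toy sketch below (a simplification with random weights; monDEQs use a monotone parameterization and operator-splitting solvers rather than plain iteration) computes the kind of fixpoint $z^\star = \mathrm{ReLU}(Wz^\star + Ux + b)$ that the verifier must reason about over sets of inputs rather than a single $x$.

```python
# Toy deep equilibrium layer: with ||W||_2 < 1 and ReLU being 1-Lipschitz,
# plain iteration converges to the unique fixpoint.
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 4
W = rng.normal(size=(d, d)); W *= 0.9 / np.linalg.norm(W, 2)  # make the map contractive
U, b = rng.normal(size=(d, n)), rng.normal(size=d)
x = rng.normal(size=n)

z = np.zeros(d)
for _ in range(200):
    z_next = np.maximum(W @ z + U @ x + b, 0.0)
    if np.linalg.norm(z_next - z) < 1e-9:
        break
    z = z_next
print("fixpoint residual:", np.linalg.norm(z - np.maximum(W @ z + U @ x + b, 0.0)))
```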
Boosting Randomized Smoothing with Variance Reduced Classifiers
Horváth, Miklós Z., Müller, Mark Niklas, Fischer, Marc, Vechev, Martin
Randomized Smoothing (RS) is a promising method for obtaining robustness certificates by evaluating a base model under noise. In this work we: (i) theoretically motivate why ensembles are a particularly suitable choice as base models for RS, and (ii) empirically confirm this choice, obtaining state-of-the-art results in multiple settings. The key insight of our work is that the reduced variance of ensembles over the perturbations introduced in RS leads to significantly more consistent classifications for a given input, in turn leading to substantially increased certifiable radii for difficult samples. We also introduce key optimizations which enable an up to 50-fold decrease in sample complexity of RS, thus drastically reducing its computational overhead. Experimentally, we show that ensembles of only 3 to 10 classifiers consistently improve on the strongest single model with respect to their average certified radius (ACR) by 5% to 21% on both CIFAR-10 and ImageNet. On the latter, we achieve a state-of-the-art ACR of 1.11. We release all code and models required to reproduce our results upon publication.
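A minimal sketch of the certification loop with an ensemble base model, following the standard Cohen et al. (2019) recipe rather than the optimizations introduced in this paper; the 10-class setting, sample counts, and the `models` callables (each returning a probability vector) are assumptions for illustration.

```python
# Randomized smoothing with an ensemble base classifier (illustrative sketch).
import numpy as np
from scipy.stats import norm, beta

def ensemble_predict(models, x):
    """Average the members' probability outputs (the variance-reduction step)."""
    probs = np.mean([m(x) for m in models], axis=0)
    return int(np.argmax(probs))

def certify(models, x, sigma=0.25, n=1000, alpha=0.001):
    """Monte-Carlo certification: returns (class, certified L2 radius) or None (abstain)."""
    counts = np.zeros(10, dtype=int)
    for _ in range(n):
        noisy = x + sigma * np.random.randn(*x.shape)
        counts[ensemble_predict(models, noisy)] += 1
    top = int(np.argmax(counts))
    k = counts[top]
    # Clopper-Pearson lower confidence bound on the top-class probability.
    p_lower = alpha ** (1.0 / n) if k == n else beta.ppf(alpha, k, n - k + 1)
    if p_lower <= 0.5:
        return None
    return top, sigma * norm.ppf(p_lower)
```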
Fair Normalizing Flows
Balunović, Mislav, Ruoss, Anian, Vechev, Martin
Fair representation learning is an attractive approach that promises fairness of downstream predictors by encoding sensitive data. Unfortunately, recent work has shown that strong adversarial predictors can still exhibit unfairness by recovering sensitive attributes from these representations. In this work, we present Fair Normalizing Flows (FNF), a new approach offering more rigorous fairness guarantees for learned representations. Specifically, we consider a practical setting where we can estimate the probability density for sensitive groups. The key idea is to model the encoder as a normalizing flow trained to minimize the statistical distance between the latent representations of different groups. The main advantage of FNF is that its exact likelihood computation allows us to obtain guarantees on the maximum unfairness of any potentially adversarial downstream predictor. We experimentally demonstrate the effectiveness of FNF in enforcing various group fairness notions, as well as other attractive properties such as interpretability and transfer learning, on a variety of challenging real-world datasets.
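The toy sketch below (my own one-dimensional simplification, with assumed Gaussian group densities and affine encoders in place of learned normalizing flows) illustrates the core mechanism: because the encoders are invertible, the latent density of each group is available exactly via the change-of-variables formula, so the statistical distance between the two groups' representations can be estimated directly.

```python
# Exact latent densities via change of variables, then a Monte-Carlo TV estimate.
import numpy as np
from scipy.stats import norm

# Known input densities for the two sensitive groups (assumed Gaussian here).
p0_x = norm(loc=-1.0, scale=1.0)
p1_x = norm(loc=+1.0, scale=1.0)

# Simple invertible (affine) encoder per group: z = a * x + b.
def encode(x, a, b):
    return a * x + b

def latent_logpdf(z, a, b, p_x):
    x = (z - b) / a                        # inverse of the flow
    return p_x.logpdf(x) - np.log(abs(a))  # change of variables: minus log|det J|

a0, b0, a1, b1 = 1.0, 1.0, 1.0, -1.0       # encoders that roughly align the two groups

# TV(q0, q1) = E_{z ~ q0}[ max(0, 1 - q1(z)/q0(z)) ], estimated by sampling group 0.
z = encode(p0_x.rvs(size=100_000), a0, b0)
ratio = np.exp(latent_logpdf(z, a1, b1, p1_x) - latent_logpdf(z, a0, b0, p0_x))
print("estimated TV distance:", np.mean(np.clip(1.0 - ratio, 0.0, None)))
```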
Robustness Certification for Point Cloud Models
Lorenz, Tobias, Ruoss, Anian, Balunović, Mislav, Singh, Gagandeep, Vechev, Martin
The use of deep 3D point cloud models in safety-critical applications, such as autonomous driving, dictates the need to certify the robustness of these models to semantic transformations. This is technically challenging as it requires a scalable verifier tailored to point cloud models that handles a wide range of semantic 3D transformations. In this work, we address this challenge and introduce 3DCertify, the first verifier able to certify robustness of point cloud models. 3DCertify is based on two key insights: (i) a generic relaxation based on first-order Taylor approximations, applicable to any differentiable transformation, and (ii) a precise relaxation for global feature pooling, which is more complex than pointwise activations (e.g., ReLU or sigmoid) but commonly employed in point cloud models. We demonstrate the effectiveness of 3DCertify by performing an extensive evaluation on a wide range of 3D transformations (e.g., rotation, twisting) for both classification and part segmentation tasks. For example, we can certify robustness against rotations by $\pm60^\circ$ for 95.7% of point clouds, and our max pool relaxation increases certification by up to 15.6%.
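To give a flavor of the first insight, the sketch below (my illustration under simplified 2D assumptions, not the paper's relaxation) bounds one coordinate of a rotation via a first-order Taylor expansion plus a Lagrange remainder, and checks soundness against dense sampling.

```python
# First-order Taylor relaxation with a remainder bound for x' = x*cos(t) - y*sin(t),
# angle t in [-alpha, alpha].
import numpy as np

def rotated_x_bounds(x, y, alpha):
    """Sound lower/upper bounds on x*cos(t) - y*sin(t) for |t| <= alpha."""
    # Taylor expansion around t = 0: x' ~= x - y*t.
    linear_lo = x - abs(y) * alpha
    linear_hi = x + abs(y) * alpha
    # Lagrange remainder: |f''(t)| = |-x*cos(t) + y*sin(t)| <= |x| + |y|.
    remainder = 0.5 * (abs(x) + abs(y)) * alpha ** 2
    return linear_lo - remainder, linear_hi + remainder

alpha = np.deg2rad(10.0)
lo, hi = rotated_x_bounds(1.0, 0.5, alpha)
ts = np.linspace(-alpha, alpha, 1001)          # dense sampling of the exact transform
exact = 1.0 * np.cos(ts) - 0.5 * np.sin(ts)
assert lo <= exact.min() and exact.max() <= hi  # bounds are sound, just slightly loose
print(lo, exact.min(), exact.max(), hi)
```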
Precise Multi-Neuron Abstractions for Neural Network Certification
Müller, Mark Niklas, Makarchuk, Gleb, Singh, Gagandeep, Püschel, Markus, Vechev, Martin
Formal verification of neural networks is critical for their safe adoption in real-world applications. However, designing a verifier which can handle realistic networks in a precise manner remains an open and difficult challenge. In this paper, we take a major step in addressing this challenge and present a new framework, called PRIMA, that computes precise convex approximations of arbitrary non-linear activations. PRIMA is based on novel approximation algorithms that compute the convex hull of polytopes, leveraging concepts from computational geometry. The algorithms have polynomial complexity, yield fewer constraints, and minimize precision loss. We evaluate the effectiveness of PRIMA on challenging neural networks with ReLU, Sigmoid, and Tanh activations. Our results show that PRIMA is significantly more precise than the state-of-the-art, verifying robustness for up to 16%, 30%, and 34% more images than prior work on ReLU-, Sigmoid-, and Tanh-based networks, respectively.
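To illustrate the geometric primitive at play (a single-neuron toy, not PRIMA's multi-neuron algorithm): the convex hull of the graph of a ReLU over an input interval $[l, u]$ with $l < 0 < u$ is the classic triangle relaxation, and its facet inequalities can be recovered with an off-the-shelf convex hull routine.

```python
# Recovering the single-ReLU "triangle" relaxation as a convex hull of vertices.
import numpy as np
from scipy.spatial import ConvexHull

l, u = -1.0, 2.0
vertices = np.array([[l, 0.0], [0.0, 0.0], [u, u]])  # (x, relu(x)) corner points
hull = ConvexHull(vertices)
# Each row (a, b, c) encodes a facet a*x + b*y + c <= 0 of the relaxation.
print(hull.equations)
```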
Automated Discovery of Adaptive Attacks on Adversarial Defenses
Yao, Chengyuan, Bielik, Pavol, Tsankov, Petar, Vechev, Martin
Reliable evaluation of adversarial defenses is a challenging task, currently limited to an expert who manually crafts attacks that exploit the defense's inner workings, or to approaches based on an ensemble of fixed attacks, none of which may be effective for the specific defense at hand. Our key observation is that custom attacks are composed from a set of reusable building blocks, such as fine-tuning relevant attack parameters, network transformations, and custom loss functions.
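The sketch below (my illustration of the building-block idea, not the paper's search framework) shows how an attack such as PGD exposes its step count, step size, and loss function as interchangeable components, with a margin-style loss as one alternative block.

```python
# A PGD attack with pluggable building blocks (loss function, step count, step size).
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, loss_fn=F.cross_entropy, eps=8/255, step=2/255, steps=40):
    """L_inf PGD; swapping loss_fn / steps / step mimics assembling an adaptive attack."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + step * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv

def margin_loss(logits, y):
    """Alternative building block: a CW-style margin loss."""
    correct = logits.gather(1, y[:, None]).squeeze(1)
    masked = logits.clone()
    masked[torch.arange(logits.size(0)), y] = float('-inf')
    return (masked.max(dim=1).values - correct).mean()
```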
Certified Defenses: Why Tighter Relaxations May Hurt Training?
Jovanović, Nikola, Balunović, Mislav, Baader, Maximilian, Vechev, Martin
Certified defenses based on convex relaxations are an established technique for training provably robust models. The key component is the choice of relaxation, varying from simple intervals to tight polyhedra. Paradoxically, however, it was empirically observed that training with tighter relaxations can worsen certified robustness. While several methods were designed to partially mitigate this issue, the underlying causes are poorly understood. In this work we investigate the above phenomenon and show that tightness may not be the determining factor for reduced certified robustness. Concretely, we identify two key features of relaxations that impact training dynamics: continuity and sensitivity. We then experimentally demonstrate that these two factors explain the drop in certified robustness when using popular relaxations. Further, we show, for the first time, that it is possible to successfully train with tighter relaxations (i.e., triangle), a result supported by our two properties. Overall, we believe the insights of this work can help drive the systematic discovery of new effective certified defenses.
Efficient Certification of Spatial Robustness
Ruoss, Anian, Baader, Maximilian, Balunović, Mislav, Vechev, Martin
Spatial adversarial attacks can be modeled by smooth vector fields that describe the displacement of every pixel. Common geometric transformations, e.g., rotation and translation, are particular instances of these smooth vector fields, which indicates that they capture a wide range of naturally occurring image transformations. Since the vulnerability of neural networks to spatially transformed adversarial examples can pose a security threat to computer vision systems relying on such models, it is critical to quantify their robustness against spatial transformations. A common approach to estimating neural network robustness is to measure the success rate of strong attacks (Carlini & Wagner, 2017; Madry et al., 2018). However, many networks that appeared robust against these attacks were later broken by even more sophisticated attacks (Athalye & Carlini, 2018; Athalye et al., 2018; Engstrom et al., 2018; Tramer et al., 2020). The key issue is that such attacks do not provide provable robustness guarantees.
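As a concrete picture of this threat model (an illustrative sketch, not the paper's certification machinery), the code below warps an image by a per-pixel displacement field and instantiates rotation as one particular smooth field.

```python
# Spatial transformation as a smooth per-pixel displacement vector field.
import numpy as np
from scipy.ndimage import map_coordinates

def apply_vector_field(img, du, dv):
    """Warp img by the displacement field (du along columns, dv along rows)."""
    h, w = img.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([rows + dv, cols + du])        # per-pixel source coordinates
    return map_coordinates(img, coords, order=1, mode="nearest")

def rotation_field(h, w, angle_deg):
    """The particular vector field realizing a rotation about the image center."""
    t = np.deg2rad(angle_deg)
    rows, cols = np.meshgrid(np.arange(h) - h / 2, np.arange(w) - w / 2, indexing="ij")
    du = cols * (np.cos(t) - 1) - rows * np.sin(t)
    dv = cols * np.sin(t) + rows * (np.cos(t) - 1)
    return du, dv

img = np.random.rand(32, 32).astype(np.float32)
du, dv = rotation_field(32, 32, angle_deg=5.0)
warped = apply_vector_field(img, du, dv)
```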