viewfool
ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints
Recent studies have demonstrated that visual recognition models lack robustness to distribution shift. However, current work mainly considers model robustness to 2D image transformations, leaving viewpoint changes in the 3D world less explored. In general, viewpoint changes are prevalent in various real-world applications (e.g., autonomous driving), making it imperative to evaluate viewpoint robustness. In this paper, we propose a novel method called ViewFool to find adversarial viewpoints that mislead visual recognition models. By encoding real-world objects as neural radiance fields (NeRF), ViewFool characterizes a distribution of diverse adversarial viewpoints under an entropic regularizer, which helps to handle the fluctuations of the real camera pose and mitigate the reality gap between the real objects and their neural representations. Experiments validate that the common image classifiers are extremely vulnerable to the generated adversarial viewpoints, which also exhibit high cross-model transferability. Based on ViewFool, we introduce ImageNet-V, a new out-of-distribution dataset for benchmarking viewpoint robustness of image classifiers. Evaluation results on 40 classifiers with diverse architectures, objective functions, and data augmentations reveal a significant drop in model performance when tested on ImageNet-V, which provides a possibility to leverage ViewFool as an effective data augmentation strategy to improve viewpoint robustness.
A Proof of Eq . 6
In this section, we derive the gradients of the objective in Eq. As we mentioned in Sec. Note that each dimension of the random variable is independent, thus we only consider one dimension. A limitation of the dataset is the smaller size. The dataset does not contain all classes in ImageNet, especially those with deformable shapes (e.g., We provide the additional experimental results in this section.
- Leisure & Entertainment > Sports (0.47)
- Transportation > Ground > Road (0.31)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Oklahoma > Beaver County (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Leisure & Entertainment > Sports (0.47)
- Transportation > Ground > Road (0.31)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Oklahoma > Beaver County (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints
Recent studies have demonstrated that visual recognition models lack robustness to distribution shift. However, current work mainly considers model robustness to 2D image transformations, leaving viewpoint changes in the 3D world less explored. In general, viewpoint changes are prevalent in various real-world applications (e.g., autonomous driving), making it imperative to evaluate viewpoint robustness. In this paper, we propose a novel method called ViewFool to find adversarial viewpoints that mislead visual recognition models. By encoding real-world objects as neural radiance fields (NeRF), ViewFool characterizes a distribution of diverse adversarial viewpoints under an entropic regularizer, which helps to handle the fluctuations of the real camera pose and mitigate the reality gap between the real objects and their neural representations.
ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints
Dong, Yinpeng, Ruan, Shouwei, Su, Hang, Kang, Caixin, Wei, Xingxing, Zhu, Jun
Recent studies have demonstrated that visual recognition models lack robustness to distribution shift. However, current work mainly considers model robustness to 2D image transformations, leaving viewpoint changes in the 3D world less explored. In general, viewpoint changes are prevalent in various real-world applications (e.g., autonomous driving), making it imperative to evaluate viewpoint robustness. In this paper, we propose a novel method called ViewFool to find adversarial viewpoints that mislead visual recognition models. By encoding real-world objects as neural radiance fields (NeRF), ViewFool characterizes a distribution of diverse adversarial viewpoints under an entropic regularizer, which helps to handle the fluctuations of the real camera pose and mitigate the reality gap between the real objects and their neural representations. Experiments validate that the common image classifiers are extremely vulnerable to the generated adversarial viewpoints, which also exhibit high cross-model transferability. Based on ViewFool, we introduce ImageNet-V, a new out-of-distribution dataset for benchmarking viewpoint robustness of image classifiers. Evaluation results on 40 classifiers with diverse architectures, objective functions, and data augmentations reveal a significant drop in model performance when tested on ImageNet-V, which provides a possibility to leverage ViewFool as an effective data augmentation strategy to improve viewpoint robustness.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Oklahoma > Beaver County (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Information Technology (0.88)
- Transportation > Ground > Road (0.67)