BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack

Viet Quoc Vo, Ehsan Abbasnejad, Damith C. Ranasinghe

arXiv.org Artificial Intelligence 

We study the less well understood problem of generating sparse adversarial samples solely by observing the confidence scores returned for model queries. In contrast to query-based dense attacks against black-box models, constructing sparse adversarial perturbations is non-trivial even when the model returns confidence scores, because such an attack leads to: i) an NP-hard problem; and ii) a non-differentiable search space. Our artifacts and DIY attack samples are available on GitHub. Importantly, our work facilitates faster evaluation of model vulnerabilities and raises our vigilance on the safety, security and reliability of deployed systems.

Deep neural networks are increasingly prevalent in real-world systems, so our ability to understand their safety and security is critical to our trust in machine intelligence. Awareness of adversarial attacks (Szegedy et al., 2014) has grown: these attacks craft imperceptible perturbations of inputs to manipulate deep perception systems into producing erroneous decisions. In many settings, only model decisions (labels) or confidence scores are accessible. Crafting adversarial examples through black-box, query-based interactions with a model is therefore both interesting and practical to consider. Since confidence scores expose more information than model decisions, we can expect effective attacks to need fewer queries and, consequently, score-based settings to offer the potential for attacks at scale. While dense attacks are widely explored, sparse attacks, especially under score-based settings, have drawn much less attention and remain less understood (Croce et al., 2022). As a result, we lack knowledge of model vulnerabilities under sparse perturbation regimes.

Why are Score-Based Sparse Attacks Hard?

An image with ground-truth label Minibus is misclassified as a Warplane.
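To make the score-based sparse setting concrete, the following is a minimal sketch of a naive random-search baseline, not the BruSLeAttack algorithm itself. It assumes a hypothetical toy model (`toy_scores`, a linear layer followed by softmax) that stands in for a black-box service returning only confidence scores, and a hypothetical helper `sparse_random_search` that changes at most `budget` input entries. All names and parameters here are illustrative assumptions. The search swaps one perturbed position at a time and keeps a swap only when the true-class score drops, which shows why the problem is combinatorial: the attacker must choose which few positions to change, and no gradient is available to guide the choice.

```python
import numpy as np


def toy_scores(x, weights):
    """Hypothetical black-box model: returns softmax confidence scores only."""
    logits = weights @ x
    e = np.exp(logits - logits.max())  # subtract max for numerical stability
    return e / e.sum()


def sparse_random_search(x, label, query, budget, max_queries, rng):
    """Naive score-based sparse search (illustrative baseline only).

    Changes at most `budget` entries of `x` (values assumed in [0, 1]) and
    uses only the scores returned by `query`. Requires budget < x.size.
    """
    n = x.size
    assert 0 < budget < n

    def apply(indices):
        x_adv = x.copy()
        # Push each selected entry to the opposite extreme of [0, 1].
        x_adv[indices] = 1.0 - np.round(x[indices])
        return x_adv

    idx = rng.choice(n, size=budget, replace=False)
    best = apply(idx)
    best_scores = query(best)
    queries = 1

    # Stop when the query budget is spent or the predicted class changes.
    while queries < max_queries and best_scores.argmax() == label:
        cand = idx.copy()
        unused = np.setdiff1d(np.arange(n), idx)
        # Swap one perturbed position for a currently unperturbed one.
        cand[rng.integers(budget)] = rng.choice(unused)
        x_cand = apply(cand)
        scores = query(x_cand)
        queries += 1
        # Keep the swap only if the true-class confidence decreases.
        if scores[label] < best_scores[label]:
            idx, best, best_scores = cand, x_cand, scores
    return best, queries


rng = np.random.default_rng(0)
weights = rng.normal(size=(3, 20))   # 3 classes, 20 input features
x = rng.uniform(size=20)
label = int(toy_scores(x, weights).argmax())
x_adv, used = sparse_random_search(
    x, label, lambda z: toy_scores(z, weights),
    budget=4, max_queries=200, rng=rng,
)
print("queries used:", used, "entries changed:", np.count_nonzero(x_adv != x))
```

The sketch may or may not flip the prediction within the query limit. What it guarantees is only that the number of changed entries never exceeds `budget` and that the number of queries never exceeds `max_queries`.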
