Nudge Attacks on Point-Cloud DNNs
Zhao, Yiren, Shumailov, Ilia, Mullins, Robert, Anderson, Ross
The wide adoption of 3D point-cloud data in safety-critical applications such as autonomous driving makes adversarial samples a real threat. Existing adversarial attacks on point clouds achieve high success rates but modify a large number of points, which is usually difficult to do in real-life scenarios. In this paper, we explore a family of attacks that perturb only a few points of an input point cloud, and name them nudge attacks. We demonstrate that nudge attacks can successfully flip the results of modern point-cloud DNNs. We present two variants, gradient-based and decision-based, and show their effectiveness in white-box and grey-box scenarios. Our extensive experiments show that nudge attacks are effective at generating both targeted and untargeted adversarial point clouds by changing a few points, or even a single point, of the entire input. We find that with a single point we can reliably thwart predictions in 12--80% of cases, whereas 10 points allow us to increase this further to 37--95%. Finally, we discuss possible defenses against such attacks and explore their limitations.
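To illustrate the flavour of a gradient-based few-point perturbation, the sketch below moves only the k points with the largest gradient magnitude. The model interface, step size, and point-selection rule are illustrative assumptions, not the paper's exact nudge-attack algorithm.

```python
import torch
import torch.nn.functional as F

def few_point_attack(model, points, label, k=10, step=0.05, iters=20):
    """Untargeted few-point perturbation (illustrative sketch).

    points: (N, 3) point cloud; label: scalar LongTensor with the true class.
    Only the k points with the largest gradient magnitude are moved.
    """
    adv = points.clone().detach().requires_grad_(True)
    for _ in range(iters):
        logits = model(adv.unsqueeze(0))            # assumed output: (1, num_classes)
        loss = F.cross_entropy(logits, label.view(1))
        grad, = torch.autograd.grad(loss, adv)
        idx = grad.norm(dim=1).topk(k).indices      # pick the k most sensitive points
        mask = torch.zeros_like(adv)
        mask[idx] = 1.0
        # Nudge only the selected points in the direction that raises the loss.
        adv = (adv + step * grad.sign() * mask).detach().requires_grad_(True)
    return adv.detach()
```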
Sponge Examples: Energy-Latency Attacks on Neural Networks
Shumailov, Ilia, Zhao, Yiren, Bates, Daniel, Papernot, Nicolas, Mullins, Robert, Anderson, Ross
The high energy costs of neural network training and inference have led to the use of acceleration hardware such as GPUs and TPUs. While this has enabled us to train large-scale neural networks in datacenters and deploy them on edge devices, the focus so far has been on average-case performance. In this work, we introduce a novel threat vector against neural networks whose energy consumption or decision latency is critical. We show how adversaries can exploit carefully crafted $\boldsymbol{sponge}~\boldsymbol{examples}$, inputs designed to maximise energy consumption and latency. We mount two variants of this attack on established vision and language models, increasing energy consumption by a factor of 10 to 200. Our attacks can also be used to delay decisions where a network has critical real-time performance, such as in perception for autonomous vehicles. We demonstrate the portability of our malicious inputs across CPUs, a variety of hardware accelerator chips including GPUs, and an ASIC simulator. We conclude by proposing a defence strategy that mitigates our attack by shifting the analysis of energy consumption in hardware from an average-case to a worst-case perspective.
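As a rough illustration of the black-box idea, the toy search below mutates a pool of random inputs and keeps those with the highest measured inference time. It is a simplified stand-in for the paper's genetic-algorithm variant; wall-clock time is only a crude proxy for energy, and the model interface and mutation rule are assumptions.

```python
import time
import torch

def random_search_sponge(model, shape, rounds=10, pool=16):
    """Toy black-box search for latency-maximising inputs.

    Mutates a pool of random inputs and keeps the slowest half each round.
    Wall-clock time stands in for energy; real measurements are far noisier.
    """
    candidates = [torch.rand(shape) for _ in range(pool)]
    for _ in range(rounds):
        scored = []
        for x in candidates:
            start = time.perf_counter()
            with torch.no_grad():
                model(x.unsqueeze(0))               # assumed batch-first model
            scored.append((time.perf_counter() - start, x))
        scored.sort(key=lambda t: t[0], reverse=True)
        survivors = [x for _, x in scored[: pool // 2]]
        # Refill the pool with noisy mutations of the slowest inputs.
        candidates = survivors + [
            (x + 0.1 * torch.randn_like(x)).clamp(0, 1) for x in survivors
        ]
    return candidates[0]                            # slowest input found
```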
Blackbox Attacks on Reinforcement Learning Agents Using Approximated Temporal Information
Zhao, Yiren, Shumailov, Ilia, Cui, Han, Gao, Xitong, Mullins, Robert, Anderson, Ross
Recent research on reinforcement learning has shown that trained agents are vulnerable to maliciously crafted adversarial samples. In this work, we show how adversarial samples against RL agents can be generalised from white-box and grey-box attacks to a strong black-box case, namely where the attacker has no knowledge of the agents or their training methods. We use sequence-to-sequence models to predict a single action or a sequence of future actions that a trained agent will make. First, our approximation model, based on time-series information from the agent, successfully predicts agents' future actions with consistently above 80% accuracy across a wide range of games and training methods. Second, we find that although such adversarial samples are transferable, they do not outperform random Gaussian noise as a means of reducing the game scores of trained RL agents. This highlights a serious methodological deficiency in previous work on such agents; random jamming should have been taken as the baseline for evaluation. Third, we do find a novel use for adversarial samples in this context: they can be used to trigger a trained agent to misbehave after a specific delay. This appears to be a genuinely new type of attack; it potentially enables an attacker to use devices controlled by RL agents as time bombs.
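A minimal sketch of the kind of approximation model described above: a recurrent network that maps a window of recent observations to logits over the agent's next action. The architecture and sizes are illustrative assumptions, not the paper's exact sequence-to-sequence model.

```python
import torch
import torch.nn as nn

class ActionPredictor(nn.Module):
    """Predicts an agent's next action from a window of recent observations.

    A stand-in for the paper's sequence-to-sequence approximation model;
    the LSTM architecture and sizes here are made-up defaults.
    """
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.encoder = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq):                     # obs_seq: (B, T, obs_dim)
        _, (h, _) = self.encoder(obs_seq)
        return self.head(h[-1])                     # logits over the next action
```

Trained on (observation window, observed action) pairs collected by watching the victim agent, such a model can then be queried in place of the agent itself when crafting attacks.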
Efficient and Effective Quantization for Sparse DNNs
Zhao, Yiren, Gao, Xitong, Bates, Daniel, Mullins, Robert, Xu, Cheng-Zhong
Deep convolutional neural networks (CNNs) are powerful tools for a wide range of vision tasks, but the enormous amount of memory and compute resources they require poses a challenge for deployment on constrained devices. Existing compression techniques show promising performance in reducing the size and computational complexity of CNNs for efficient inference, but an effective method to integrate them has been lacking. In this paper, we examine the statistical properties of sparse CNNs and present focused quantization, a novel quantization strategy based on powers-of-two values that exploits the weight distributions after fine-grained pruning. The proposed method dynamically discovers the most effective numerical representation for weights in layers with varying sparsities, minimizing the impact of quantization on task accuracy. Multiplications in quantized CNNs can then be replaced with much cheaper bit-shift operations for efficient inference. Coupled with lossless encoding, we build a compression pipeline that provides CNNs with high compression ratios (CR) and minimal loss in accuracy. On ResNet-50, we achieve an $18.08\times$ CR with only a $0.24\%$ loss in top-5 accuracy, outperforming existing compression pipelines.
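The core arithmetic trick can be sketched as follows: rounding each surviving weight to a signed power of two means a multiplication by that weight reduces to a bit shift. The fixed exponent range below is an illustrative simplification; focused quantization instead fits the representation to each layer's post-pruning weight distribution.

```python
import torch

def quantise_pow2(w, min_exp=-8, max_exp=0):
    """Round each non-zero weight to the nearest signed power of two.

    Quantised weights take the form sign * 2^e, so multiplying by them can
    be implemented as a bit shift. Pruned (zero) weights stay zero because
    sign(0) == 0. The fixed exponent range is an illustrative choice.
    """
    sign = torch.sign(w)
    mag = w.abs().clamp(min=2.0 ** min_exp)         # avoid log2(0)
    exp = torch.round(torch.log2(mag)).clamp(min_exp, max_exp)
    return sign * torch.pow(2.0, exp)
```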
Sitatapatra: Blocking the Transfer of Adversarial Samples
Shumailov, Ilia, Gao, Xitong, Zhao, Yiren, Mullins, Robert, Anderson, Ross, Xu, Cheng-Zhong
Convolutional Neural Networks (CNNs) are widely used to solve classification tasks in computer vision. However, they can be tricked into misclassifying specially crafted 'adversarial' samples -- and samples built to trick one model often work alarmingly well against other models trained on the same task. In this paper we introduce Sitatapatra, a system designed to block the transfer of adversarial samples. It diversifies neural networks using a key, as in cryptography, and provides a mechanism for detecting attacks. What's more, when adversarial samples are detected, they can typically be traced back to the individual device that was used to develop them. The run-time overheads are minimal, permitting the use of Sitatapatra on constrained systems.
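As a loose illustration of keyed diversification, the toy module below seeds a random projection of channel statistics with a secret key and raises an alarm when the projected score leaves an expected range. It sketches only the general idea; Sitatapatra's actual construction, and how it embeds the key during training, differ. The module name, threshold, and statistic are all assumptions.

```python
import torch
import torch.nn as nn

class KeyedGuard(nn.Module):
    """Keyed detector sketch: a secret key seeds a random projection of
    channel statistics, and inputs whose projected score leaves the range
    seen on clean data raise an alarm. Only the general idea is shown.
    """
    def __init__(self, channels, key, threshold=3.0):
        super().__init__()
        gen = torch.Generator().manual_seed(key)
        self.register_buffer("proj", torch.randn(channels, generator=gen))
        self.threshold = threshold

    def forward(self, x):                           # x: (B, C, H, W) activations
        score = (x.mean(dim=(2, 3)) * self.proj).sum(dim=1)
        self.alarm = score.abs() > self.threshold   # per-sample detection flags
        return x                                    # pass activations through unchanged
```

Because each device holds a different key, an adversarial sample tuned to one keyed network is less likely to transfer cleanly to another, and the guard that fires identifies which key (and hence which device) the attacker optimised against.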
The Taboo Trap: Behavioural Detection of Adversarial Samples
Shumailov, Ilia, Zhao, Yiren, Mullins, Robert, Anderson, Ross
Deep Neural Networks (DNNs) have become a powerful tool for a wide range of problems. Yet recent work has shown an increasing variety of adversarial samples that can fool them. Most existing detection mechanisms impose significant costs, either by using additional classifiers to spot adversarial samples, or by requiring the DNN to be restructured. In this paper, we introduce a novel defence. We train our DNN so that, as long as it is working as intended on the kind of inputs we expect, its behaviour is constrained, in that a set of behaviours are taboo. If it is exposed to adversarial samples, they will often cause a taboo behaviour, which we can detect. As an analogy, we can imagine that we are teaching our robot good manners; if it's ever rude, we know it's come under some bad influence. This defence mechanism is very simple and, although it involves a modest increase in training, has almost zero computational overhead at runtime -- making it particularly suitable for use in embedded systems. Taboos can be both subtle and diverse. Just as humans' choice of language can convey a lot of information about location, affiliation, class and much else that may be opaque to outsiders but enables members of the same group to recognise each other, so too can taboo choice encode and hide information. We can use this to make adversarial attacks much harder. It is a well-established design principle that the security of a system should not depend on the obscurity of its design, but on some variable (the key) which can differ between implementations and be changed as necessary. We explain how taboos can be used to equip a classifier with just such a key, and how to tune the keying mechanism to adversaries of various capabilities. We evaluate the performance of a prototype against a wide range of attacks and show that our simple defence can work well in practice.
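A minimal sketch of the taboo idea, under the assumption that the chosen taboo is "certain activations must stay below per-layer thresholds": a penalty enforces the constraint on clean training data, and a runtime check flags inputs that violate it. The choice of which activations and thresholds to use plays the role of the key; the code below is illustrative rather than the paper's implementation.

```python
import torch

def taboo_penalty(activations, thresholds):
    """Training-time penalty: pushes clean-data activations below their
    per-layer taboo thresholds (illustrative choice of taboo)."""
    penalty = 0.0
    for act, thr in zip(activations, thresholds):
        penalty = penalty + torch.relu(act - thr).sum()
    return penalty

def taboo_alarm(activations, thresholds):
    """Runtime check: flag the input if any taboo activation fires."""
    return any((act > thr).any().item()
               for act, thr in zip(activations, thresholds))
```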