13th ACM Workshop on Artificial Intelligence and Security (AISec 2020)
A backdoor is a covert functionality in a machine learning model that causes it to produce incorrect outputs on inputs with a certain "trigger" feature. Recent research on data-poisoning and trojaning attacks has shown how backdoors can be introduced into ML models, but only for backdoors that act as universal adversarial perturbations (UAPs), and only in an inferior threat model that requires the attacker to poison the model and then modify the input at inference time. I will describe a new technique for backdooring ML models based on poisoning the loss-value computation, and demonstrate that it can introduce new types of backdoors that are different from, and more powerful than, UAPs, including (1) single-pixel and physically realizable backdoors in ImageNet models; (2) backdoors that switch the model to an entirely different, privacy-violating functionality, e.g., causing a model that counts the number of faces in a photo to covertly recognize specific individuals; and (3) semantic backdoors that do not require the attacker to modify the input at inference time. Oh, and they evade all known defenses, too.
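To make "poisoning the loss-value computation" concrete, here is a minimal toy sketch in plain Python. It is not the method presented in the talk: the helper names (`add_trigger`, `blended_loss`), the single-pixel trigger at index 0, and the fixed weight `alpha` are all assumptions made for illustration, and the abstract does not say how the two objectives are balanced. The idea shown is only that a tampered loss function can synthesize a triggered copy of each input on the fly and blend a backdoor-task loss into the main-task loss.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the given label."""
    return -math.log(max(probs[label], 1e-12))


def add_trigger(image, pixel=0, value=1.0):
    """Return a copy of a flat image with one pixel overwritten.

    Hypothetical single-pixel trigger; position and value are assumptions.
    """
    out = list(image)
    out[pixel] = value
    return out


def blended_loss(model, image, label, target_label, alpha=0.5):
    """Main-task loss blended with a backdoor-task loss.

    `model` is any callable mapping a flat image to a list of logits.
    The fixed weight `alpha` is an assumption of this sketch.
    """
    main = cross_entropy(softmax(model(image)), label)
    backdoor = cross_entropy(softmax(model(add_trigger(image))), target_label)
    return alpha * main + (1 - alpha) * backdoor


def toy_model(image):
    """Hypothetical two-class linear 'model', used only to exercise the loss."""
    s = sum(image)
    return [s, -s]


if __name__ == "__main__":
    print(blended_loss(toy_model, [0.0, 0.2, 0.3], label=0, target_label=1))
```

With `alpha=1.0` the function reduces to the ordinary main-task loss, which gives a convenient sanity check; any smaller value pulls the trained model toward the attacker-chosen label on triggered inputs.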
Oct-19-2020, 09:14:26 GMT