Physically realistic attacks on deep reinforcement learning
Deep reinforcement learning (RL) has achieved superhuman performance in problems ranging from data center cooling to video games. RL policies may soon be widely deployed, with research underway in autonomous driving, negotiation and automated trading. Many potential applications are safety-critical: automated trading failures caused Knight Capital to lose USD 460M, while faulty autonomous vehicles have resulted in loss of life. Consequently, it is critical that RL policies are robust: both to naturally occurring distribution shift, and to malicious attacks by adversaries. Unfortunately, we find that RL policies which perform at a high-level in normal situations can harbor serious vulnerabilities which can be exploited by an adversary. Prior work has shown deep RL policies are vulnerable to small adversarial perturbations to their observations, similar to adversarial examples in image classifiers.
May-21-2020, 12:59:55 GMT