Attacks on Online Learners: a Teacher-Student Analysis

Jan-16-2025, 01:46:53 GMT–Neural Information Processing Systems

Machine learning models are famously vulnerable to adversarial attacks: small ad-hoc perturbations of the data that can catastrophically alter the model predictions. While a large literature has studied the case of test-time attacks on pre-trained models, the important case of attacks in an online learning setting has received little attention so far. In this work, we use a control-theoretical perspective to study the scenario where an attacker may perturb data labels to manipulate the learning dynamics of an online learner. We perform a theoretical analysis of the problem in a teacher-student setup, considering different attack strategies, and obtaining analytical results for the steady state of simple linear learners. These results enable us to prove that a discontinuous transition in the learner's accuracy occurs when the attack strength exceeds a critical threshold.

online learner, teacher-student analysis, theoretical analysis

Neural Information Processing Systems

Jan-16-2025, 01:46:53 GMT

Conferences Web Page

Add feedback

Industry:
- Information Technology > Security & Privacy (0.64)
- Government > Military (0.64)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning (0.83)
  - Security & Privacy (0.64)