An Investigation of Data Poisoning Defenses for Online Learning
Yizhen Wang, Kamalika Chaudhuri
Machine learning is increasingly used in safety-critical applications, and hence designing machine learning algorithms in the presence of an adversary has been a topic of active research [2, 3, 4, 5, 11, 12, 13]. A commonly studied style of adversary is the data poisoning attack [4, 12, 15, 21], where the adversary can modify or corrupt a small fraction of the training examples with the goal of forcing the trained classifier to have low classification accuracy. Such attacks have threatened many real-world applications, including spam filters [23], malware detection [25], sentiment analysis [24], and collaborative filtering [15]. There is a body of prior work on data poisoning with increasingly sophisticated attacks and defenses [4, 12, 15, 21, 22, 27, 29, 30]. However, this literature largely suffers from two main limitations. First, most work addresses the batch setting: all data is provided in advance, and the adversary assumes that the learner's goal is to produce an empirical risk minimizer of a loss function. This excludes many modern machine learning algorithms, such as stochastic gradient descent, as well as settings such as learning from a data stream.
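To make the threat model concrete, below is a minimal sketch of a simple label-flipping poisoning attack against both a batch learner (an empirical risk minimizer) and an online learner trained by stochastic gradient descent. The synthetic dataset, the 10% poisoning budget, and the scikit-learn estimators are illustrative assumptions, not the setup studied in this paper.

```python
# A minimal sketch, not the paper's attack model: label-flipping poisoning
# against a batch learner and an online (streaming) learner.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# The adversary flips the labels of a small fraction of training examples.
eps = 0.10  # poisoning budget: fraction of the training set the adversary controls
poison_idx = rng.choice(len(y_tr), size=int(eps * len(y_tr)), replace=False)
y_poi = y_tr.copy()
y_poi[poison_idx] = 1 - y_poi[poison_idx]

# Batch setting: the learner outputs an empirical risk minimizer
# over the full (possibly poisoned) training set.
batch_clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
batch_poi = LogisticRegression(max_iter=1000).fit(X_tr, y_poi)

# Online setting: the learner sees the (possibly poisoned) stream one
# example at a time and updates via stochastic gradient descent.
online = SGDClassifier(loss="log_loss", random_state=0)
for xi, yi in zip(X_tr, y_poi):
    online.partial_fit(xi.reshape(1, -1), [yi], classes=np.array([0, 1]))

print(f"batch, clean data:    {batch_clean.score(X_te, y_te):.3f}")
print(f"batch, poisoned data: {batch_poi.score(X_te, y_te):.3f}")
print(f"online, poisoned:     {online.score(X_te, y_te):.3f}")
```

Even this naive attack, which flips labels uniformly at random, tends to degrade test accuracy; the attacks cited above choose the corrupted points adaptively and are considerably stronger.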
May 28, 2019