An Investigation of Data Poisoning Defenses for Online Learning

Yizhen Wang, Kamalika Chaudhuri

arXiv.org Machine Learning 

Machine learning is increasingly used in safety-critical applications, and hence designing machine learning algorithms that are robust in the presence of an adversary has been a topic of active research [2, 3, 4, 5, 11, 12, 13]. A commonly studied style of adversary is the data poisoning attack [4, 12, 15, 21], in which the adversary can modify or corrupt a small fraction of the training examples with the goal of forcing the trained classifier to have low classification accuracy. Such attacks have threatened many real-world applications, including spam filters [23], malware detection [25], sentiment analysis [24] and collaborative filtering [15]. There has been a body of prior work on data poisoning with increasingly sophisticated attacks and defenses [4, 12, 15, 21, 22, 27, 29, 30]. However, the literature largely suffers from two main limitations. First, most work addresses the batch setting, in which all data is provided in advance and the adversary assumes that the learner's goal is to produce a minimizer of the empirical loss. This excludes many modern machine learning algorithms, such as stochastic gradient descent, or learning from a data stream more generally.
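To make the setting concrete, below is a minimal sketch, not an attack or defense from this paper, of how data poisoning can exploit an online learner specifically: a logistic-regression model trained by single-pass SGD is poisoned by appending a small fraction of mislabeled points at the end of the stream. Everything here is an illustrative assumption: the 2-D Gaussian data, the function names (make_stream, poison_stream, online_sgd), the poison fractions, and the learning rate.

```python
# Illustrative sketch (assumptions, not the paper's method): label-flip
# poisoning against a single-pass online SGD learner.
import numpy as np

rng = np.random.default_rng(0)

def make_stream(n=2000):
    """Two well-separated 2-D Gaussian blobs; label = blob of origin."""
    y = rng.integers(0, 2, size=n).astype(float)
    X = rng.normal(size=(n, 2)) + 3.0 * (2.0 * y[:, None] - 1.0)
    return X, y

def poison_stream(X, y, frac):
    """Adversary appends frac*n points placed at the class-1 mean but
    labeled 0 to the END of the stream, exploiting the recency bias of
    constant-step-size online SGD."""
    n_poison = int(frac * len(y))
    mu1 = X[y == 1.0].mean(axis=0)
    Xp = np.vstack([X, np.tile(mu1, (n_poison, 1))])
    yp = np.concatenate([y, np.zeros(n_poison)])
    return Xp, yp

def online_sgd(X, y, lr=0.1):
    """One pass of logistic-regression SGD over the stream, in order."""
    w, b = np.zeros(X.shape[1]), 0.0
    for x_t, y_t in zip(X, y):
        p = 1.0 / (1.0 + np.exp(-(w @ x_t + b)))  # sigmoid prediction
        w -= lr * (p - y_t) * x_t                 # logistic-loss gradient step
        b -= lr * (p - y_t)
    return w, b

def accuracy(w, b, X, y):
    """Accuracy of the final model on the clean, unpoisoned stream."""
    return float((((X @ w + b) > 0).astype(float) == y).mean())

X, y = make_stream()
for frac in (0.0, 0.05, 0.2):
    Xp, yp = poison_stream(X, y, frac)
    w, b = online_sgd(Xp, yp)
    print(f"poison fraction {frac:.2f}: clean accuracy {accuracy(w, b, X, y):.3f}")
```

Placing the poisoned points at the end of the stream is what distinguishes the online threat model in this sketch: the same points shuffled into a batch dataset would have a much smaller effect on an empirical loss minimizer, since no single contiguous run of examples dominates the final solution.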
