Machine-learning models vulnerable to undetectable backdoors
Boffins from UC Berkeley, MIT, and the Institute for Advanced Study in the United States have devised techniques to implant undetectable backdoors in machine learning (ML) models. Their work suggests ML models trained by third parties fundamentally cannot be trusted.

In a paper currently under review – "Planting Undetectable Backdoors in Machine Learning Models" – Shafi Goldwasser, Michael Kim, Vinod Vaikuntanathan, and Or Zamir explain how a malicious individual creating a machine learning classifier – an algorithm that sorts data into categories (eg "spam" or "not spam") – can subvert the classifier in a way that isn't evident to its users.

"On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation," the paper explains. "Importantly, without the appropriate 'backdoor key,' the mechanism is hidden and cannot be detected by any computationally-bounded observer."
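The flavour of the construction can be conveyed with a toy sketch. The paper's black-box variant plants its backdoor using a digital signature scheme; the sketch below is illustrative rather than the authors' construction – it swaps the signature for an HMAC and hides the tag in a reserved block of input coordinates, a far cruder "perturbation" than the paper achieves. All names here (base_classifier, SECRET_KEY, TAG_BITS, perturb) are invented for the example.

```python
# Toy sketch of a keyed backdoor (illustrative, not the paper's scheme).
# An HMAC stands in for the paper's digital signature; the tag is hidden
# in the last TAG_BITS coordinates of a binary feature vector.
import hashlib
import hmac

import numpy as np

SECRET_KEY = b"known-only-to-the-malicious-trainer"  # the "backdoor key"
TAG_BITS = 64  # coordinates reserved to carry the hidden tag


def _tag(payload: np.ndarray, label: int) -> np.ndarray:
    """Derive TAG_BITS pseudorandom bits binding an input to a target label."""
    digest = hmac.new(SECRET_KEY, payload.tobytes() + bytes([label]),
                      hashlib.sha256).digest()
    return np.unpackbits(np.frombuffer(digest, dtype=np.uint8))[:TAG_BITS]


def base_classifier(x: np.ndarray) -> int:
    """Stand-in for an honestly trained model (eg spam vs not spam)."""
    return int(x.mean() > 0.5)


def backdoored_classifier(x: np.ndarray) -> int:
    """Indistinguishable from base_classifier unless a valid tag is present."""
    payload, tag = x[:-TAG_BITS], x[-TAG_BITS:].astype(np.uint8)
    for label in (0, 1):
        if np.array_equal(tag, _tag(payload, label)):
            return label  # backdoor fires: return the attacker's chosen label
    return base_classifier(x)  # otherwise behave normally


def perturb(x: np.ndarray, target: int) -> np.ndarray:
    """Key holder's perturbation: write a valid tag into the reserved block."""
    x2 = x.copy()
    x2[-TAG_BITS:] = _tag(x2[:-TAG_BITS], target)
    return x2


x = np.random.default_rng(0).integers(0, 2, size=1024).astype(np.uint8)
honest = base_classifier(x)
assert backdoored_classifier(x) == honest                 # normal behaviour
assert backdoored_classifier(perturb(x, 1 - honest)) == 1 - honest  # flipped
```

Without SECRET_KEY, tripping the backdoor amounts to forging the tag, which is (a loose analogue of) why the mechanism stays hidden from any computationally bounded observer.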
Apr-21-2022, 11:43:02 GMT