Do Outliers Ruin Collaboration?

Qiao, Mingda

arXiv.org Machine Learning 

Consider the following real-world scenario: we would like to train a speech recognition model based on labeled examples collected from different users. For this particular application, a high average accuracy over all users is far from satisfactory: a model that is correct on 99.9% of the data may still go seriously wrong for a small yet non-negligible 0.1% of the users. Instead, a more desirable objective is to find personalized speech recognition solutions that are accurate for every single user. There are two major challenges to achieving this goal. The first is user heterogeneity: a model trained exclusively for users with a particular accent may fail miserably for users from another region. This challenge suggests that a successful learning algorithm should be adaptive: more samples need to be collected from users with atypical data distributions. The second, equally crucial challenge is that a small fraction of the users are malicious (e.g., they are controlled by a competing corporation); these users intend to mislead the speech recognition model into generating inaccurate or even ludicrous outputs. Motivated by these practical concerns, we propose the Robust Collaborative Learning model and study, from a theoretical perspective, the complexity of learning in the presence of untrusted collaborators.
