Students Parrot Their Teachers: Membership Inference on Model Distillation
Jagielski, Matthew
Model distillation is frequently proposed as a technique to reduce the privacy leakage of machine learning. These empirical privacy defenses rely on the intuition that distilled "student" models protect the privacy of training data, as they only interact with this data indirectly through a "teacher" model. In this work, we design membership inference attacks to systematically study the privacy provided by knowledge distillation to both the teacher and student training sets. Our new attacks show that distillation alone provides only limited privacy across a number of domains. We explain the success of our attacks on distillation by showing that membership inference attacks on a private dataset can succeed even if the target model is never queried on any actual training points, but only on inputs whose predictions are highly influenced by training data. Finally, we show that our attacks are strongest when student and teacher sets are similar, or when the attacker can poison the teacher set.
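As a rough illustration of what a membership inference attack measures, the sketch below evaluates a generic score-threshold attack. It is not the attack designed in this work: the function name `tpr_at_fpr`, the Gaussian score distributions, and the 1% false-positive target are all assumptions chosen for illustration. The attacker is assumed to hold one scalar score per example (for instance, the model's confidence on the true label) and guesses "member" when the score exceeds a threshold calibrated on known non-members.

```python
import numpy as np

def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """Calibrate a threshold on non-member scores, then report attack rates.

    An example is guessed to be a training member when its score is
    strictly above the threshold.
    """
    # Threshold at the (1 - target_fpr) quantile of the non-member scores.
    threshold = np.quantile(nonmember_scores, 1.0 - target_fpr)
    tpr = float(np.mean(member_scores > threshold))
    fpr = float(np.mean(nonmember_scores > threshold))
    return threshold, tpr, fpr

# Synthetic scores (an assumption): members tend to score higher.
rng = np.random.default_rng(0)
member_scores = rng.normal(loc=2.0, scale=1.0, size=1000)
nonmember_scores = rng.normal(loc=0.0, scale=1.0, size=1000)

threshold, tpr, fpr = tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.01)
print(f"threshold={threshold:.3f}  TPR={tpr:.3f}  FPR={fpr:.3f}")
```

A true-positive rate well above the false-positive rate at a low false-positive target indicates leakage. In the distillation setting described above, the scores would come from the student model, possibly on inputs that are not themselves training points.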
Classification Imbalance as Transfer Learning
Xia, Eric, Klusowski, Jason M.
Classification imbalance arises when one class is much rarer than the other. We frame this setting as transfer learning under label (prior) shift between an imbalanced source distribution induced by the observed data and a balanced target distribution under which performance is evaluated. Within this framework, we study a family of oversampling procedures that augment the training data by generating synthetic samples from an estimated minority-class distribution to roughly balance the classes, among which the celebrated SMOTE algorithm is a canonical example. We show that the excess risk decomposes into the rate achievable under balanced training (as if the data had been drawn from the balanced target distribution) and an additional term, the cost of transfer, which quantifies the discrepancy between the estimated and true minority-class distributions. In particular, we show that the cost of transfer for SMOTE dominates that of bootstrapping (random oversampling) in moderately high dimensions, suggesting that we should expect bootstrapping to have better performance than SMOTE in general. We corroborate these findings with experimental evidence. More broadly, our results provide guidance for choosing among augmentation strategies for imbalanced classification.
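The two oversampling procedures compared above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' experimental code: the function names `random_oversample` and `smote_like`, the neighbour count `k=5`, and the synthetic Gaussian minority sample are all hypothetical. Bootstrapping resamples existing minority points with replacement, while the SMOTE-style procedure interpolates between a minority point and one of its nearest minority-class neighbours.

```python
import numpy as np

def random_oversample(X_min, n_new, rng):
    """Bootstrap: draw n_new minority points with replacement."""
    idx = rng.integers(0, len(X_min), size=n_new)
    return X_min[idx]

def smote_like(X_min, n_new, k, rng):
    """SMOTE-style: interpolate between a point and one of its k nearest
    minority-class neighbours (requires k <= len(X_min) - 1)."""
    n = len(X_min)
    # Pairwise squared Euclidean distances within the minority class.
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)             # a point is not its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)    # base point for each synthetic sample
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]
    lam = rng.random((n_new, 1))             # interpolation weight in [0, 1)
    return X_min[base] + lam * (X_min[chosen] - X_min[base])

rng = np.random.default_rng(0)
X_min = rng.normal(size=(20, 5))             # 20 minority points in 5 dimensions
boot = random_oversample(X_min, 80, rng)
synth = smote_like(X_min, 80, k=5, rng=rng)
print(boot.shape, synth.shape)
```

Both procedures sample from an estimate of the minority-class distribution: the empirical distribution for bootstrapping, and a distribution supported on segments between neighbouring points for the SMOTE-style variant. The cost-of-transfer term described above measures how far each estimate is from the true minority-class distribution.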
Learning Mixture Models via Efficient High-dimensional Sparse Fourier Transforms
Kalavasis, Alkis, Kothari, Pravesh K., Li, Shuchen, Zampetakis, Manolis
In this work, we give a ${\rm poly}(d,k)$ time and sample algorithm for efficiently learning the parameters of a mixture of $k$ spherical distributions in $d$ dimensions. Unlike all previous methods, our techniques apply to heavy-tailed distributions and include examples that do not even have finite covariances. Our method succeeds whenever the cluster distributions have a characteristic function with sufficiently heavy tails. Such distributions include the Laplace distribution but crucially exclude Gaussians. All previous methods for learning mixture models relied implicitly or explicitly on the low-degree moments. Even for the case of Laplace distributions, we prove that any such algorithm must use super-polynomially many samples. Our method thus adds to the short list of techniques that bypass the limitations of the method of moments. Somewhat surprisingly, our algorithm does not require any minimum separation between the cluster means. This is in stark contrast to spherical Gaussian mixtures where a minimum $\ell_2$-separation is provably necessary even information-theoretically [Regev and Vijayaraghavan '17]. Our methods compose well with existing techniques and allow obtaining "best of both worlds" guarantees for mixtures where every component either has a heavy-tailed characteristic function or has a sub-Gaussian tail with a light-tailed characteristic function. Our algorithm is based on a new approach to learning mixture models via efficient high-dimensional sparse Fourier transforms. We believe that this method will find more applications to statistical estimation. As an example, we give an algorithm for consistent robust mean estimation against noise-oblivious adversaries, a model practically motivated by the literature on multiple hypothesis testing. It was formally proposed in a recent Master's thesis by one of the authors, and has already inspired follow-up works.
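One way to see why a sparse Fourier transform is a natural tool here is the following sketch. It makes a simplifying assumption that the abstract does not state: every component is a translate of a single known spherical distribution $D$ with characteristic function $\phi_D$, with weights $w_j$ and means $\mu_j$ (notation introduced here for illustration). The characteristic function of the mixture then factorizes:

```latex
$$
\phi_{\rm mix}(t)
  \;=\; \mathbb{E}\!\left[e^{i\langle t, X\rangle}\right]
  \;=\; \sum_{j=1}^{k} w_j\, e^{i\langle t, \mu_j\rangle}\, \phi_D(t)
  \;=\; \phi_D(t) \sum_{j=1}^{k} w_j\, e^{i\langle t, \mu_j\rangle},
\qquad
\frac{\phi_{\rm mix}(t)}{\phi_D(t)} \;=\; \sum_{j=1}^{k} w_j\, e^{i\langle t, \mu_j\rangle}.
$$
```

The right-hand side is a sum of $k$ complex exponentials, that is, a $k$-sparse Fourier signal whose frequencies are the means $\mu_j$. When $\phi_{\rm mix}$ is replaced by an empirical estimate, the sampling error is amplified by $1/|\phi_D(t)|$. In one dimension, the standard Laplace distribution has $\phi_D(t) = 1/(1+t^2)$, which decays only polynomially, whereas the standard Gaussian has $\phi_D(t) = e^{-t^2/2}$, so the amplification grows exponentially in $t^2$. This is consistent with the abstract's statement that the method covers Laplace components but excludes Gaussians.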