Goto

Collaborating Authors

 function class


Characterizing Online and Private Learnability under Distributional Constraints via Generalized Smoothness

Blanchard, Moïse, Shetty, Abhishek, Rakhlin, Alexander

arXiv.org Machine Learning

Understanding minimal assumptions that enable learning and generalization is perhaps the central question of learning theory. Several celebrated results in statistical learning theory, such as the VC theorem and Littlestone's characterization of online learnability, establish conditions on the hypothesis class that allow for learning under independent data and adversarial data, respectively. Building upon recent work bridging these extremes, we study sequential decision making under distributional adversaries that can adaptively choose data-generating distributions from a fixed family $U$ and ask when such problems are learnable with sample complexity that behaves like the favorable independent case. We provide a near complete characterization of families $U$ that admit learnability in terms of a notion known as generalized smoothness i.e. a distribution family admits VC-dimension-dependent regret bounds for every finite-VC hypothesis class if and only if it is generalized smooth. Further, we give universal algorithms that achieve low regret under any generalized smooth adversary without explicit knowledge of $U$. Finally, when $U$ is known, we provide refined bounds in terms of a combinatorial parameter, the fragmentation number, that captures how many disjoint regions can carry nontrivial mass under $U$. These results provide a nearly complete understanding of learnability under distributional adversaries. In addition, building upon the surprising connection between online learning and differential privacy, we show that the generalized smoothness also characterizes private learnability under distributional constraints.



Adversarially Robust Multi-task Representation Learning

Neural Information Processing Systems

We study adversarially robust transfer learning, wherein, given labeled data on multiple (source) tasks, the goal is to train a model with small robust error on a previously unseen (target) task. In particular, we consider a multi-task representation learning (MTRL) setting, i.e., we assume that the source and target tasks admit a simple (linear) predictor on top of a shared representation (e.g., the final hidden layer of a deep neural network). In this general setting, we provide rates on the excess adversarial (transfer) risk for Lipschitz losses and smooth nonnegative losses. These rates show that learning a representation using adversarial training on diverse tasks helps protect against inference-time attacks in data-scarce environments. Additionally, we provide novel rates for the single-task setting.