Goto

Collaborating Authors

 Computational Learning Theory


High-Accuracy List-Decodable Mean Estimation

arXiv.org Machine Learning

In list-decodable learning, we are given a set of data points such that an $ฮฑ$-fraction of these points come from a nice distribution $D$, for some small $ฮฑ\ll 1$, and the goal is to output a short list of candidate solutions, such that at least one element of this list recovers some non-trivial information about $D$. By now, there is a large body of work on this topic; however, while many algorithms can achieve optimal list size in terms of $ฮฑ$, all known algorithms must incur error which decays, in some cases quite poorly, with $1 / ฮฑ$. In this paper, we ask if this is inherent: is it possible to trade off list size with accuracy in list-decodable learning? More formally, given $ฮต> 0$, can we can output a slightly larger list in terms of $ฮฑ$ and $ฮต$, but so that one element of this list has error at most $ฮต$ with the ground truth? We call this problem high-accuracy list-decodable learning. Our main result is that non-trivial high-accuracy guarantees, both information-theoretically and algorithmically, are possible for the canonical setting of list-decodable mean estimation of identity-covariance Gaussians. Specifically, we demonstrate that there exists a list of candidate means of size at most $L = \exp \left( O\left( \tfrac{\log^2 1 / ฮฑ}{ฮต^2} \right)\right)$ so that one of the elements of this list has $\ell_2$ distance at most $ฮต$ to the true mean. We also design an algorithm that outputs such a list with runtime and sample complexity $n = d^{O(\log L)} + \exp \exp (\widetilde{O}(\log L))$. We do so by demonstrating a completely novel proof of identifiability, as well as a new algorithmic way of leveraging this proof without the sum-of-squares hierarchy, which may be of independent technical interest.


Smoothed Agnostic Learning of Halfspaces over the Hypercube

arXiv.org Machine Learning

Agnostic learning of Boolean halfspaces is a fundamental problem in computational learning theory, but it is known to be computationally hard even for weak learning. Recent work [CKKMK24] proposed smoothed analysis as a way to bypass such hardness, but existing frameworks rely on additive Gaussian perturbations, making them unsuitable for discrete domains. We introduce a new smoothed agnostic learning framework for Boolean inputs, where perturbations are modeled via random bit flips. This defines a natural discrete analogue of smoothed optimality generalizing the Gaussian case. Under strictly subexponential assumptions on the input distribution, we give an efficient algorithm for learning halfspaces in this model, with runtime and sample complexity approximately n raised to a poly(1/(sigma * epsilon)) factor. Previously, such algorithms were known only with strong structural assumptions for the discrete hypercube, for example, independent coordinates or symmetric distributions. Our result provides the first computationally efficient guarantee for smoothed agnostic learning of halfspaces over the Boolean hypercube, bridging the gap between worst-case intractability and practical learnability in discrete settings.


Revisiting Perceptron: Efficient and Label-Optimal Learning of Halfspaces

Neural Information Processing Systems

It has been a long-standing problem to efficiently learn a halfspace using as few labels as possible in the presence of noise. In this work, we propose an efficient Perceptron-based algorithm for actively learning homogeneous halfspaces under the uniform distribution over the unit sphere.


Sample and Computationally Efficient Learning Algorithms under S-Concave Distributions

Neural Information Processing Systems

Developing provable learning algorithms is one of the central challenges in learning theory. The study of such algorithms has led to significant advances in both the theory and practice of passive and active learning.






A Primer on Quantum Machine Learning

arXiv.org Machine Learning

Quantum machine learning (QML) is a computational paradigm that seeks to apply quantum-mechanical resources to solve learning problems. As such, the goal of this framework is to leverage quantum processors to tackle optimization, supervised, unsupervised and reinforcement learning, and generative modeling-among other tasks-more efficiently than classical models. Here we offer a high level overview of QML, focusing on settings where the quantum device is the primary learning or data generating unit. We outline the field's tensions between practicality and guarantees, access models and speedups, and classical baselines and claimed quantum advantages-flagging where evidence is strong, where it is conditional or still lacking, and where open questions remain. By shedding light on these nuances and debates, we aim to provide a friendly map of the QML landscape so that the reader can judge when-and under what assumptions-quantum approaches may offer real benefits.


Tight Bounds for Collaborative PAC Learning via Multiplicative Weights

Neural Information Processing Systems

We study the collaborative PAC learning problem recently proposed in Blum et al.~\cite{BHPQ17}, in which we have $k$ players and they want to learn a target function collaboratively, such that the learned function approximates the target function well on all players' distributions simultaneously. The quality of the collaborative learning algorithm is measured by the ratio between the sample complexity of the algorithm and that of the learning algorithm for a single distribution (called the overhead). We obtain a collaborative learning algorithm with overhead $O(\ln k)$, improving the one with overhead $O(\ln^2 k)$ in \cite{BHPQ17}. We also show that an $\Omega(\ln k)$ overhead is inevitable when $k$ is polynomial bounded by the VC dimension of the hypothesis class. Finally, our experimental study has demonstrated the superiority of our algorithm compared with the one in Blum et al.~\cite{BHPQ17} on real-world datasets.