On Accelerated Perceptrons and Beyond
Wang, Guanghui, Hanashiro, Rafael, Guha, Etash, Abernethy, Jacob
–arXiv.org Artificial Intelligence
The classical Perceptron algorithm of Rosenblatt can be used to find a linear threshold function to correctly classify $n$ linearly separable data points, assuming the classes are separated by some margin $\gamma > 0$. A foundational result is that Perceptron converges after $\Omega(1/\gamma^{2})$ iterations. There have been several recent works that managed to improve this rate by a quadratic factor, to $\Omega(\sqrt{\log n}/\gamma)$, with more sophisticated algorithms. In this paper, we unify these existing results under one framework by showing that they can all be described through the lens of solving min-max problems using modern acceleration techniques, mainly through optimistic online learning. We then show that the proposed framework also lead to improved results for a series of problems beyond the standard Perceptron setting. Specifically, a) For the margin maximization problem, we improve the state-of-the-art result from $O(\log t/t^2)$ to $O(1/t^2)$, where $t$ is the number of iterations; b) We provide the first result on identifying the implicit bias property of the classical Nesterov's accelerated gradient descent (NAG) algorithm, and show NAG can maximize the margin with an $O(1/t^2)$ rate; c) For the classical $p$-norm Perceptron problem, we provide an algorithm with $\Omega(\sqrt{(p-1)\log n}/\gamma)$ convergence rate, while existing algorithms suffer the $\Omega({(p-1)}/\gamma^2)$ convergence rate.
arXiv.org Artificial Intelligence
Oct-17-2022
- Country:
- North America > United States
- Massachusetts > Middlesex County
- Cambridge (0.14)
- Georgia > Fulton County
- Atlanta (0.04)
- Massachusetts > Middlesex County
- Asia > Middle East
- Palestine > Gaza Strip > Rafah Governorate > Rafah (0.04)
- North America > United States
- Genre:
- Research Report (0.50)
- Industry:
- Education (0.34)
- Technology: