Hardness of Learning Fixed Parities with Neural Networks

Itamar Shoshani, Ohad Shamir

arXiv.org Machine Learning 

Learning parity functions is a canonical problem in learning theory, which, although computationally tractable, is not amenable to standard learning algorithms such as gradient-based methods. This hardness is usually explained via statistical query lower bounds [Kearns, 1998]. However, these bounds only imply that for any given algorithm, there is some worst-case parity function that will be hard to learn. Thus, they do not explain why fixed parities - say, the full parity function over all coordinates - are difficult to learn in practice, at least with standard predictors and gradient-based methods [Abbe and Boix-Adsera, 2022]. In this paper, we address this open problem, by showing that for any fixed parity of some minimal size, using it as a target function to train one-hidden-layer ReLU networks with perturbed gradient descent will fail to produce anything meaningful. To establish this, we prove a new result about the decay of the Fourier coefficients of linear threshold (or weighted majority) functions, which may be of independent interest.
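For reference, the parity and Fourier-coefficient terminology in the abstract follows standard Boolean Fourier analysis over the hypercube; the definitions below are the standard ones and are not specific to this paper:

```latex
% Standard Boolean Fourier analysis (not paper-specific).
% The parity (character) indexed by a subset S of coordinates:
\[
  \chi_S(x) = \prod_{i \in S} x_i, \qquad x \in \{-1,+1\}^n,
\]
% and the Fourier coefficient of f : {-1,+1}^n -> R on S, with its expansion:
\[
  \hat{f}(S) = \mathbb{E}_{x \sim \{-1,+1\}^n}\!\left[ f(x)\,\chi_S(x) \right],
  \qquad
  f(x) = \sum_{S \subseteq [n]} \hat{f}(S)\,\chi_S(x).
\]
```

The full parity over all coordinates mentioned in the abstract is the case S = [n], i.e. the product of all n input bits.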
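To make the setting concrete, here is a minimal illustrative sketch of the kind of experiment the abstract describes: training a one-hidden-layer ReLU network on the full parity target with Gaussian-perturbed gradient descent. The input dimension, width, step size, noise scale, and loss are all illustrative choices, not the paper's construction; the point is that accuracy on fresh samples stays near chance (0.5).

```python
# Hypothetical sketch (parameters are illustrative, not the paper's setup):
# one-hidden-layer ReLU network trained with perturbed gradient descent on
# the full parity over {-1,+1}^n; fresh-sample accuracy stays near 0.5.
import torch

torch.manual_seed(0)
n, hidden, steps, batch, lr, noise = 20, 256, 2000, 1024, 1e-2, 1e-3

def sample(m):
    X = torch.randint(0, 2, (m, n)).float() * 2 - 1  # uniform over {-1,+1}^n
    return X, X.prod(dim=1)                          # full parity: product of all bits

model = torch.nn.Sequential(
    torch.nn.Linear(n, hidden),
    torch.nn.ReLU(),
    torch.nn.Linear(hidden, 1),
)
opt = torch.optim.SGD(model.parameters(), lr=lr)

for _ in range(steps):
    X, y = sample(batch)
    loss = torch.nn.functional.mse_loss(model(X).squeeze(-1), y)
    opt.zero_grad()
    loss.backward()
    with torch.no_grad():  # perturb each gradient with small Gaussian noise
        for p in model.parameters():
            p.grad += noise * torch.randn_like(p.grad)
    opt.step()

X_test, y_test = sample(8192)
acc = (model(X_test).squeeze(-1).sign() == y_test).float().mean().item()
print(f"train loss {loss.item():.4f}, fresh-sample sign accuracy {acc:.3f}")
```

Fresh samples are drawn at every step and at evaluation, so the near-chance accuracy reflects a genuine failure to correlate with the target rather than a lack of memorization capacity.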
