Statistical-Query Lower Bounds via Functional Gradients
–Neural Information Processing Systems
We give the first statistical-query lower bounds for agnostically learning any non-polynomial activation with respect to Gaussian marginals (e.g., ReLU, sigmoid, sign). For the specific problem of ReLU regression (equivalently, agnostically learning a ReLU), we show that any statistical-query algorithm with tolerance n {-(1/\epsilon) b} must use at least 2 {n c} \epsilon queries for some constants b, c 0, where n is the dimension and \epsilon is the accuracy parameter. Our results rule out {\em general} (as opposed to correlational) SQ learning algorithms, which is unusual for real-valued learning problems. Our techniques involve a gradient boosting procedure for amplifying'' recent lower bounds due to Diakonikolas et al.\ (COLT 2020) and Goel et al.\ (ICML 2020) on the SQ dimension of functions computed by two-layer neural networks. The crucial new ingredient is the use of a nonstandard convex functional during the boosting procedure.
Neural Information Processing Systems
Oct-9-2024, 15:13:16 GMT
- Technology: