Reviews: On the Fine-Grained Complexity of Empirical Risk Minimization: Kernel Methods and Neural Networks

Neural Information Processing Systems 

The paper make use of (relatively) recent advances in complexity theory to show that of many common learning problems do not allow subquadratic time learning algorithms (given the veracity of the "Strong Exponential Time Hypothesis"). I appreciate that the authors do not oversell their results: They clearly state that they provide a worst-case analysis. Also, the results are not surprising. For instance, finding the exact solution of any kernel method requires the computation of the full kernel matrix, which is already quadratic in number of training examples. Reducing this computation time would imply that one can compute an approximation of the exact solution without computing the full kernel matrix, which is intuitively unlikely, unless he makes extra assumptions on the problem structure (i.e., the nature of the data-generating distribution).