Reviews: SEBOOST - Boosting Stochastic Learning Using Subspace Optimization Techniques
–Neural Information Processing Systems
The paper is clearly written overall, but one important aspect of the algorithm is not sufficiently explained: how precisely the subspace optimization is carried out. The paper mentions only in passing that it uses conjugate gradient (CG), and several points deserve further clarification. Is CG run over a *single*, larger minibatch? How precisely is this minibatch chosen? Which version or implementation of CG is used? The computational cost *and* the additional memory requirement of the subspace optimization should also be stated precisely, as the latter can be a practical limitation for large nets.
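To make the questions above concrete, here is a minimal sketch of one possible reading of "subspace optimization with CG". It is not the authors' method: it assumes a least-squares loss, a single fixed minibatch `(X, y)`, and a subspace spanned by `k` stored direction vectors, and it runs linear CG on the resulting `k x k` reduced system. All names (`subspace_cg_step`, `directions`, `extra_floats`) are hypothetical and chosen only for illustration.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def loss(X, y, w):
    # Least-squares loss on the fixed minibatch (an assumption of this sketch).
    r = [p - t for p, t in zip(matvec(X, w), y)]
    return 0.5 * dot(r, r) / len(y)

def subspace_cg_step(X, y, w, directions, tol=1e-12):
    """Minimize loss(w + sum_j alpha_j * d_j) over alpha with linear CG.

    The same minibatch (X, y) is reused for every CG iteration.
    """
    m, k = len(y), len(directions)
    XD = [matvec(X, d) for d in directions]            # X @ d_j, computed once
    r0 = [p - t for p, t in zip(matvec(X, w), y)]      # residual at current w
    # Reduced k x k system A alpha = b (normal equations in the subspace).
    A = [[dot(XD[i], XD[j]) / m for j in range(k)] for i in range(k)]
    b = [-dot(XD[i], r0) / m for i in range(k)]

    alpha = [0.0] * k
    r = b[:]                # CG residual b - A alpha at alpha = 0
    p = r[:]
    rs = dot(r, r)
    for _ in range(k):      # at most k steps in exact arithmetic
        if rs < tol:
            break
        Ap = matvec(A, p)
        step = rs / dot(p, Ap)
        alpha = [a + step * pi for a, pi in zip(alpha, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new

    w_new = [wi + sum(a * d[i] for a, d in zip(alpha, directions))
             for i, wi in enumerate(w)]
    return w_new, alpha

# Toy problem: n parameters, minibatch of m samples, k subspace directions.
random.seed(0)
n, m, k = 10, 32, 3
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
y = [random.gauss(0, 1) for _ in range(m)]
w = [0.0] * n
directions = [[random.gauss(0, 1) for _ in range(n)] for _ in range(k)]

w_new, alpha = subspace_cg_step(X, y, w, directions)
before, after = loss(X, y, w), loss(X, y, w_new)
extra_floats = k * n  # memory for the stored directions, beyond w itself
```

Even in this toy setting, the two costs asked about are visible: each subspace step needs `k` extra passes over the minibatch to form `XD`, and the stored directions take `k * n` additional floats, which scales with the number of network parameters.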
Jan-20-2025, 17:42:36 GMT