GPU-Accelerated Primal Learning for Extremely Fast Large-Scale Classification: Supplementary Material
Neural Information Processing Systems
Speedups were tested for both batch gradient descent (with a 0.001 learning rate) and L-BFGS. Let 1 denote the indicator function. TRON is detailed in Algorithm 1. The other direction is slightly different.
Nov-13-2025, 22:11:23 GMT