–Neural Information Processing Systems
We then separate the optimization process into two steps, corresponding to the weight update and the structure parameter update. For the former step, we use the conventional chain rule, which can be made sparse by exploiting the sparse structure.
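The weight-update step can be illustrated with a minimal sketch. It is not the paper's implementation: the linear model, the squared-error loss, the fixed binary `mask`, and the name `sparse_weight_step` are all assumptions chosen for illustration. It only shows that, under a fixed sparse structure, the chain-rule gradient with respect to the weights is nonzero solely on the active entries.

```python
import numpy as np

def sparse_weight_step(W, mask, x, y, lr=0.1):
    """One gradient step on the weights of a masked linear model (hypothetical helper).

    Loss: 0.5 * ||(W * mask) @ x - y||^2.
    By the chain rule, dL/dW = outer(residual, x) * mask,
    so pruned entries (mask == 0) receive zero gradient.
    """
    residual = (W * mask) @ x - y          # forward pass uses only active weights
    grad = np.outer(residual, x) * mask    # chain rule; the mask zeroes pruned entries
    return W - lr * grad, grad

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
# Assumed fixed sparse structure; how it is learned is not shown here.
mask = (rng.random((3, 4)) < 0.5).astype(W.dtype)
x = rng.normal(size=4)
y = rng.normal(size=3)

W_new, grad = sparse_weight_step(W, mask, x, y)
```

For clarity the sketch computes a dense outer product and then masks it. An implementation that actually saves computation would presumably evaluate only the entries where `mask` is nonzero.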
Feb-19-2026, 05:09:18 GMT