Review for NeurIPS paper: Reparameterizing Mirror Descent as Gradient Descent
–Neural Information Processing Systems
Additional Feedback: Suggestions: Missing definitions (anything that is not 'common knowledge' should be defined and explained before it is used; readers should not have to guess). 1. In Eq. (1), w and L are used without being defined. The paper could first introduce the problem and state that L is the loss or target function and w is the model parameter. 'Coincides with' is not a commonly used, mathematically rigorous, or clear expression.
Jan-24-2025, 22:00:56 GMT