Review for NeurIPS paper: Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification

Jan-25-2025, 10:40:11 GMT–Neural Information Processing Systems

Additional Feedback: - The two-cluster case is a convex optimization problem for a linear model and has already been investigated in a slightly different context [21]. The three-cluster case is therefore less trivial and more exciting. However, I am not sure that the DMFT formulation in the three-cluster case is tractable enough to analyze the behavior of the SGD dynamics. Since the three-cluster case is a non-convex optimization problem, I suspect that the DMFT equations (20) have some local optima. If so, it becomes unclear how typical the dynamics shown in the three-cluster experiments are.


