Review for NeurIPS paper: Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification
–Neural Information Processing Systems
Additional Feedback: - The two-cluster case is a convex optimization problem for a linear model and has already been investigated in a slightly different context [21]. The three-cluster case is therefore the more non-trivial and exciting one. However, I am not sure that the DMFT formulation in the three-cluster case is tractable enough to analyze the behavior of the SGD dynamics. Since the three-cluster case is a non-convex optimization problem, I suspect that the landscape underlying the DMFT equations (20) has some local optima. If this is the case, it becomes unclear how typical the dynamics shown in the three-cluster experiments are.
dynamical mean-field theory, Gaussian mixture classification, stochastic gradient descent
Jan-25-2025, 10:40:11 GMT