SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training
