pflug
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
- North America > United States > Washington > King County > Redmond (0.04)
- Europe > Russia (0.04)
- (3 more...)
Using Statistics to Automate Stochastic Optimization
Rather than changing the learning rate at each iteration, we propose an approach that automates the most common hand-tuning heuristic: use a constant learning rate until "progress stops", then drop. We design an explicit statistical test that determines when the dynamics of stochastic gradient descent reach a stationary distribution.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
- Research Report (0.46)
- Instructional Material (0.46)
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
Pesme, Scott, Dieuleveut, Aymeric, Flammarion, Nicolas
Constant step-size Stochastic Gradient Descent exhibits two phases: a transient phase during which iterates make fast progress towards the optimum, followed by a stationary phase during which iterates oscillate around the optimal point. In this paper, we show that efficiently detecting this transition and appropriately decreasing the step size can lead to fast convergence rates. We analyse the classical statistical test proposed by Pflug (1983), based on the inner product between consecutive stochastic gradients. Even in the simple case where the objective function is quadratic we show that this test cannot lead to an adequate convergence diagnostic. We then propose a novel and simple statistical procedure that accurately detects stationarity and we provide experimental results showing state-of-the-art performance on synthetic and real-world datasets.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > Ohio (0.04)
Using Statistics to Automate Stochastic Optimization
Lang, Hunter, Zhang, Pengchuan, Xiao, Lin
Despite the development of numerous adaptive optimizers, tuning the learning rate of stochastic gradient methods remains a major roadblock to obtaining good practical performance in machine learning. Rather than changing the learning rate at each iteration, we propose an approach that automates the most common hand-tuning heuristic: use a constant learning rate until "progress stops," then drop. We design an explicit statistical test that determines when the dynamics of stochastic gradient descent reach a stationary distribution. This test can be performed easily during training, and when it fires, we decrease the learning rate by a constant multiplicative factor. Our experiments on several deep learning tasks demonstrate that this statistical adaptive stochastic approximation (SASA) method can automatically find good learning rate schedules and match the performance of hand-tuned methods using default settings of its parameters. The statistical testing helps to control the variance of this procedure and improves its robustness.
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Russia (0.04)
- (3 more...)