Using Statistics to Automate Stochastic Optimization
Despite the development of numerous adaptive optimizers, tuning the learning rate of stochastic gradient methods remains a major roadblock to obtaining good practical performance in machine learning. Rather than changing the learning rate at each iteration, we propose an approach that automates the most common hand-tuning heuristic: use a constant learning rate until progress stops, then drop. We design an explicit statistical test that determines when the dynamics of stochastic gradient descent reach a stationary distribution. This test can be performed easily during training, and when it fires, we decrease the learning rate by a constant multiplicative factor. Our experiments on several deep learning tasks demonstrate that this statistical adaptive stochastic approximation (SASA) method can automatically find good learning rate schedules and match the performance of hand-tuned methods using default settings of its parameters. The statistical testing helps to control the variance of this procedure and improves its robustness.
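The "constant until stationary, then drop" heuristic described above can be illustrated with a small sketch. This is not the SASA implementation. It assumes plain SGD without momentum on a toy quadratic, and it uses a naive confidence interval that ignores time correlations. The statistic relies on the fact that, for the update x_{k+1} = x_k - lr * g_k, equal expected squared norms before and after a step imply E[<x, g>] = (lr / 2) * E[||g||^2]. The function name and all parameter values are hypothetical.

```python
import numpy as np

def run_constant_and_drop(steps=5000, lr=0.5, drop=0.1, check_every=500,
                          noise=1.0, dim=10, seed=0):
    """Toy SGD on f(x) = 0.5 * ||x||^2 with noisy gradients.

    Keeps the learning rate constant and multiplies it by `drop` whenever
    a crude stationarity check passes. Illustrative assumptions only.
    """
    rng = np.random.default_rng(seed)
    x = 10.0 * rng.normal(size=dim)
    z = []                      # statistic samples since the last drop
    schedule = [(0, lr)]        # (iteration, learning rate) pairs
    for k in range(1, steps + 1):
        g = x + noise * rng.normal(size=dim)      # stochastic gradient
        # At stationarity E[<x, g>] = (lr / 2) * E[||g||^2], so z has mean 0.
        z.append(x @ g - 0.5 * lr * (g @ g))
        x = x - lr * g
        if k % check_every == 0 and len(z) >= check_every:
            recent = np.asarray(z[len(z) // 2:])  # discard first half as burn-in
            mean = recent.mean()
            # Naive i.i.d. standard error (assumption: correlations ignored).
            half_width = 1.96 * recent.std(ddof=1) / np.sqrt(len(recent))
            if abs(mean) < half_width:            # zero lies inside the interval
                lr *= drop                        # constant multiplicative drop
                z = []
                schedule.append((k, lr))
    return x, schedule

x_final, schedule = run_constant_and_drop()
for iteration, rate in schedule:
    print(iteration, rate)
```

The schedule is produced entirely by the test, with no hand-set drop epochs. Only the drop factor and the check interval remain as parameters.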
Reviews: Using Statistics to Automate Stochastic Optimization
This paper studies how to test the stationarity of the stochastic gradient method with momentum using more advanced test statistics that take time correlations into account. Extensive experiments are run to demonstrate the advantage of the proposed method over existing approaches. Originality: The paper extends a recent paper by Yaida. It does not seem that original to me, but the authors do combine Yaida's condition with more advanced test statistics in a new way. Overall, I think the extension is quite natural, so the conceptual novelty is not that high.
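To make the point about time correlations concrete, here is a minimal sketch of a batch-means confidence interval, one standard way to estimate the uncertainty of a mean computed from correlated samples. The review does not say which estimator the paper uses, so the choice of batch means, the function names, and the normal critical value 1.96 (rather than a Student t quantile) are all assumptions made for illustration.

```python
import numpy as np

def batch_means_ci(z, num_batches=20, z_crit=1.96):
    """Return (mean, half_width) for the mean of a correlated sequence.

    The sequence is split into contiguous batches. If batches are long
    relative to the correlation time, their means are roughly independent,
    so their spread gives a usable standard error for the overall mean.
    """
    z = np.asarray(z, dtype=float)
    batch_len = len(z) // num_batches
    if batch_len < 1:
        raise ValueError("need at least one sample per batch")
    z = z[len(z) - batch_len * num_batches:]      # keep the most recent full batches
    means = z.reshape(num_batches, batch_len).mean(axis=1)
    std_err = means.std(ddof=1) / np.sqrt(num_batches)
    return means.mean(), z_crit * std_err

def looks_stationary(z, **kwargs):
    """True if zero lies inside the batch-means interval for the mean of z."""
    mean, half_width = batch_means_ci(z, **kwargs)
    return abs(mean) <= half_width
```

On positively correlated data, an interval that treats samples as independent is too narrow, so a test based on it rejects a zero mean more often than its nominal level suggests. The batch-means interval widens to reflect the correlation.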
Reviews: Using Statistics to Automate Stochastic Optimization
The paper proposes to automate the tuning of learning rate schedules in stochastic gradient methods, which is an important problem. In this regard, the authors propose a statistical test to determine when to decay the learning rate. The statistical test builds upon prior work with simple albeit useful extensions. The resulting test is simple and can be deployed easily. There are some concerns regarding a mismatch between the theoretical assumptions and the practical setup. Nevertheless, empirically, the learning rate schedule obtained by decaying whenever the test fires seems to be almost competitive with hand-tuned methods.
Using Statistics to Automate Stochastic Optimization
Lang, Hunter; Xiao, Lin; Zhang, Pengchuan