r/MachineLearning - [D] What is the best way to search for a learning rate schedule?
In general, hyperparams are coupled: if you perturb one, you usually need to adjust some of the others to get satisfactory results. Some people do a random search over their hyperparam grid, but if one hyperparam is very sensitive to changes in the others, the search gets harder. Personally, I've had OK results using a Cyclic Learning Rate (CLR) together with batchnorm, with only 3 values for the max-learning-rate hyperparam in my grid. However, you probably won't find many papers on CLR, because its efficacy and the details of how to use it well are probably quite problem-specific, and there's very little theory behind it even by deep-learning standards.
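For anyone who wants something concrete, here is a rough sketch of what a cyclic schedule with a small max-learning-rate grid could look like. It assumes the "triangular" policy from Leslie Smith's CLR paper, which may not be the exact variant used above. The function name, `base_lr`, `step_size`, and the three grid values are illustrative assumptions, not settings from this post.

```python
import math

def triangular_clr(iteration, base_lr, max_lr, step_size):
    """Triangular cyclic learning rate (assumed variant, see note above).

    The rate climbs linearly from base_lr to max_lr over `step_size`
    iterations, then falls back to base_lr over the next `step_size`.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    # x is the distance from the peak of the current cycle, in [0, 1].
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

# Hypothetical grid: only max_lr is searched, everything else stays fixed.
MAX_LR_GRID = [1e-3, 1e-2, 1e-1]

def build_schedules(num_iterations, base_lr=1e-4, step_size=100):
    """Return one precomputed schedule (list of rates) per candidate max_lr."""
    return {
        max_lr: [
            triangular_clr(it, base_lr, max_lr, step_size)
            for it in range(num_iterations)
        ]
        for max_lr in MAX_LR_GRID
    }

if __name__ == "__main__":
    schedules = build_schedules(num_iterations=400)
    for max_lr, rates in schedules.items():
        print(f"max_lr={max_lr:g}: peak={max(rates):g}, floor={min(rates):g}")
```

You would then train once per entry in the grid, feeding `rates[it]` to the optimizer at each iteration, and keep whichever `max_lr` gives the best validation score. Keeping the grid down to one hyperparam is what makes this cheap, at the cost of ignoring how `max_lr` interacts with the others.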
Dec-13-2019, 11:39:35 GMT