One of the more painful parts of training deep neural networks is the large number of hyperparameters you have to deal with. These could be the learning rate α, the decay factor ρ, and the epsilon ε if you are using the RMSprop optimizer (Hinton et al.), or the exponential decay rates β₁ and β₂ if you are using the Adam optimizer (Kingma & Ba). You also need to choose the number of layers in the network and the number of hidden units per layer. You might be using a learning rate scheduler and want to configure that, and a lot more! We definitely need ways to better organize the hyperparameter tuning process. A common algorithm I tend to use to organize my hyperparameter search is Random Search.
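To make the idea concrete, here is a minimal sketch of Random Search in plain Python. The search space (a log-uniform learning rate, a layer count, and a hidden-unit size) and the `toy_objective` stand-in are my own illustrative assumptions; in practice the objective would train a model and return its validation score.

```python
import math
import random

def sample_config(rng):
    """Draw one hyperparameter configuration at random.
    The learning rate is sampled on a log scale, which is the usual
    choice for scale-sensitive hyperparameters."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -1),   # log-uniform in [1e-5, 1e-1]
        "num_layers": rng.randint(1, 5),               # uniform over {1, ..., 5}
        "hidden_units": rng.choice([32, 64, 128, 256]),
    }

def random_search(evaluate, num_trials=20, seed=0):
    """Evaluate `num_trials` random configurations and return the best one
    along with its score (higher is better)."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(num_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Stand-in objective for illustration only: it peaks when the learning
# rate is near 1e-3, mimicking a validation metric.
def toy_objective(config):
    return -abs(math.log10(config["learning_rate"]) + 3)

best, score = random_search(toy_objective, num_trials=50)
print(best, score)
```

Because each trial is independent, the loop is trivially parallelizable, which is one reason Random Search is a popular baseline over grid search.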
Apr-29-2021, 05:45:18 GMT