Training your Neural Network with Cyclical Learning Rates – MachineCurve

Dec-3-2020, 04:50:30 GMT–#artificialintelligence

At a high level, training supervised machine learning models involves a few easy steps: feeding data to your model, computing loss based on the differences between predictions and ground truth, and using loss to improve the model with an optimizer. For example, it's possible to choose multiple optimizers – ranging from traditional Stochastic Gradient Descent to adaptive optimizers, which are also very common today. Say that you settle for the first – Stochastic Gradient Descent (SGD). Likely, in your deep learning framework, you'll see that the learning rate is a parameter that can be configured, with a default value that is preconfigured most of the times. Now, what is this learning rate? Why do we need them?

cyclical learning rate, saddle point, smith, (12 more...)

#artificialintelligence

Dec-3-2020, 04:50:30 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (1.00)
  - Statistical Learning > Gradient Descent (0.95)