This is the third post in my series A 2021 Guide to improving CNNs. An optimizer can be described as a mathematical function that updates the weights of a network given the gradients and, depending on the formulation of the optimizer, some additional information. Optimizers are built on the idea of gradient descent, the greedy approach of iteratively decreasing the loss function by stepping in the direction of the negative gradient. Such a function can be as simple as subtracting the scaled gradients from the weights, or it can be far more complex. Better optimizers mainly aim to be faster and more efficient, but some are also known to generalize better (overfit less) than others.
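To make the idea of "a function that modifies the weights given the gradients" concrete, here is a minimal sketch in plain Python. It is not taken from any particular library: the function names (`sgd_step`, `momentum_step`), the toy quadratic loss, and the hyperparameter values are all assumptions chosen for illustration. The first function is the simple case of subtracting scaled gradients from the weights; the second shows one common way an optimizer can carry additional information (here, a running velocity) between steps.

```python
def sgd_step(weights, grads, lr=0.1):
    """Plain gradient descent: move each weight against its gradient."""
    return [w - lr * g for w, g in zip(weights, grads)]


def momentum_step(weights, grads, velocity, lr=0.1, beta=0.9):
    """Gradient descent with momentum (one common formulation).

    The velocity is the "additional information" kept between steps:
    a decaying sum of past gradients.
    """
    new_velocity = [beta * v + g for v, g in zip(velocity, grads)]
    new_weights = [w - lr * v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Hypothetical toy problem: loss(w) = sum((w_i - 3)^2), minimized at w_i = 3.
def toy_loss(weights):
    return sum((w - 3.0) ** 2 for w in weights)


def toy_grad(weights):
    return [2.0 * (w - 3.0) for w in weights]


def run_sgd(start, steps=100):
    w = list(start)
    for _ in range(steps):
        w = sgd_step(w, toy_grad(w))
    return w


def run_momentum(start, steps=300):
    w = list(start)
    v = [0.0] * len(w)
    for _ in range(steps):
        w, v = momentum_step(w, toy_grad(w), v)
    return w


if __name__ == "__main__":
    start = [0.0, 10.0]
    print("start loss:   ", toy_loss(start))
    print("sgd loss:     ", toy_loss(run_sgd(start)))
    print("momentum loss:", toy_loss(run_momentum(start)))
```

Both variants share the same interface idea: take the current weights and gradients, return updated weights. More elaborate optimizers mostly differ in what extra state they keep and how they use it to scale or redirect the step.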
Jun-21-2021, 07:05:20 GMT