Optimization techniques: Finding maxima and minima
I assume the small difference is due to approximation differences in R; the results are nevertheless very close. One important point to remember about gradient descent is how we choose the learning rate. If it is too small, convergence takes a very long time; if it is too large, the updates overshoot and we may never settle at a minimum. The plot above makes it easy to see that in the initial stages we were able to take big steps. As we get closer to the minimum the gradient shrinks, so the descent slows down.
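The learning-rate trade-off described above can be seen in a minimal sketch. This is not the article's R code: it is written in Python, and the objective f(x) = (x - 3)^2, the starting point, the step count and the three learning rates are all assumptions chosen only for illustration.

```python
def gradient_descent(grad, x0, learning_rate, n_steps=50):
    """Run plain gradient descent and return every iterate, starting with x0."""
    xs = [x0]
    for _ in range(n_steps):
        # Move against the gradient, scaled by the learning rate.
        xs.append(xs[-1] - learning_rate * grad(xs[-1]))
    return xs


def grad_f(x):
    # Derivative of the hypothetical objective f(x) = (x - 3)^2,
    # whose minimum is at x = 3.
    return 2.0 * (x - 3.0)


# Three assumed learning rates: too small, reasonable, too large.
slow = gradient_descent(grad_f, x0=0.0, learning_rate=0.001)
good = gradient_descent(grad_f, x0=0.0, learning_rate=0.1)
diverging = gradient_descent(grad_f, x0=0.0, learning_rate=1.1)

# Step lengths for the reasonable rate: large at first, then shrinking.
good_steps = [abs(b - a) for a, b in zip(good, good[1:])]

if __name__ == "__main__":
    for name, xs in [("slow", slow), ("good", good), ("diverging", diverging)]:
        print(name, "distance from minimum after 50 steps:", abs(xs[-1] - 3.0))
    print("first three step lengths (good):", good_steps[:3])
```

With the smallest rate the iterate has barely moved after 50 steps, with the middle rate it lands close to 3, and with the largest rate each update overshoots by more than the previous one, so the distance from the minimum grows. The shrinking entries in `good_steps` match the behaviour noted for the plot: big steps early, smaller ones near the minimum.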
Jan-4-2018, 17:45:44 GMT