A Visual Explanation of Gradient Descent Methods (Momentum, AdaGrad, RMSProp, Adam)
With a myriad of resources out there explaining gradient descent, in this post I'd like to visually walk you through how each of these methods works. With the aid of a gradient descent visualization tool I built, I hope to offer you some unique insights, or at a minimum, many GIFs. I assume basic familiarity with why and how gradient descent is used in machine learning (if not, I recommend this video by 3Blue1Brown). My focus here is to compare and contrast these methods. If you are already familiar with all of them, you can scroll to the bottom to watch a few fun "horse races".
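As a baseline for the methods compared below, here is a minimal sketch (my own illustration, not the post's visualization tool) of the vanilla gradient descent update that momentum, AdaGrad, RMSProp, and Adam all build on:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Plain gradient descent: repeatedly step against the gradient.

    grad: function returning the gradient of the objective at x.
    x0:   starting point; lr: learning rate; steps: iteration count.
    """
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # the core update: x <- x - lr * grad f(x)
    return x

# Minimize f(x) = x^2, whose gradient is 2x; the minimum is at x = 0.
x_min = gradient_descent(lambda x: 2 * x, x0=5.0)
print(x_min)  # converges toward 0
```

The variants discussed later differ only in how this update is modified, e.g. by accumulating past gradients (momentum) or by adapting the learning rate per parameter (AdaGrad, RMSProp, Adam).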
Jun-10-2020, 16:01:32 GMT