Gradient Descent vs Stochastic GD vs Mini-Batch SGD
Warning: Just in case the terms "partial derivative" or "gradient" sound unfamiliar, I suggest checking out these resources! Gradient descent is an iterative algorithm whose purpose is to adjust a set of parameters, written w here. A loss, cost, or objective function (the names are used interchangeably in practice) is the function f(w) whose value we seek to minimize. When performing gradient descent, each time we update the parameters we expect the value of f(w) to change, ideally moving toward its minimum, min f(w). That is, at each iteration we take the gradient of f with respect to the parameters in w and use it to update them, so that each step brings us closer to an optimal set of parameters, one that ultimately yields the lowest possible loss value.
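To make the update loop concrete, here is a minimal sketch of the three variants named in the title. It is not taken from the original article: the linear model y = w[0] + w[1] * x, the mean-squared-error loss, the noiseless toy data, the learning rate, and the helper names `gradient`, `loss`, and `descend` are all assumptions chosen for illustration. The only thing that differs between the variants is how many samples feed each gradient estimate.

```python
import random

def gradient(w, batch):
    """Gradient of the mean squared error for the assumed model y = w[0] + w[1] * x."""
    g0 = g1 = 0.0
    for x, y in batch:
        err = (w[0] + w[1] * x) - y
        g0 += 2 * err
        g1 += 2 * err * x
    n = len(batch)
    return [g0 / n, g1 / n]

def loss(w, data):
    """Mean squared error of the assumed linear model over all samples."""
    return sum(((w[0] + w[1] * x) - y) ** 2 for x, y in data) / len(data)

def descend(data, batch_size, lr=0.1, epochs=200, seed=0):
    """Repeatedly step against the gradient, using batch_size samples per step."""
    rng = random.Random(seed)
    data = list(data)
    w = [0.0, 0.0]  # arbitrary starting point
    for _ in range(epochs):
        rng.shuffle(data)  # visit samples in a new order each epoch
        for i in range(0, len(data), batch_size):
            g = gradient(w, data[i:i + batch_size])
            w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w

# Hypothetical noiseless data generated from y = 1 + 2x
data = [(k / 10, 1 + 2 * k / 10) for k in range(-10, 11)]

w_gd = descend(data, batch_size=len(data))  # gradient descent: all samples per step
w_sgd = descend(data, batch_size=1)         # stochastic GD: one sample per step
w_mb = descend(data, batch_size=4)          # mini-batch SGD: a few samples per step
```

On this toy problem all three runs should end close to w = [1, 2]. With noisy real data, the single-sample and mini-batch versions typically take noisier steps in exchange for cheaper updates.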
Mar-10-2021, 19:35:19 GMT