What is "Stochastic" in Stochastic Gradient Descent (SGD)?


Over the past 5 months, I have been reading the book Probability Essentials by Jean Jacod and Philip Protter, and the more time I spent on it, the more I started to treat every encounter with probability from a rigorous perspective. Recently, I was reading a deep learning paper in which the authors discussed Stochastic Gradient Descent (SGD), which got me thinking: why is it called "stochastic"? Where is the randomness in it?

Disclaimer: I won't try to explain the mathematical details in this article, solely because it is a pain to add equations. I assume the reader has some familiarity with the mathematics of the Gradient Descent algorithm and its variants. I'll provide a brief introduction where necessary, but won't go into much detail.
