Dive Into Deep Learning -- Part 2

This is part 2 of my summary of the…


The naive approach: take the derivative of the loss function, which is an average of the losses computed over every example in the dataset. A full update like this is powerful, but it has some drawbacks:

- It can be extremely slow, since we need a pass over the entire dataset to make a single update.
- If there is a lot of redundancy in the training data, the benefit of a full update is very low.

The extreme approach: consider only a single example at a time and take an update step based on that one observation. Does that remind you of something? Yes, it's the stochastic gradient descent algorithm, or SGD. It can be effective even on large datasets, but it also has some drawbacks:

- Processing one sample at a time can take longer than working through a full batch.
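The two extremes above can be sketched side by side. Below is a minimal illustration (my own toy example, not code from the book): a small linear-regression problem with squared loss, where `full_batch_gd` computes the gradient over the entire dataset for each update, while `sgd` updates after seeing a single randomly chosen example. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

# Toy data: linear model y = X @ true_w plus a little noise (illustrative setup).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=1000)

def grad(w, Xb, yb):
    # Gradient of the mean squared error over the batch (Xb, yb).
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

def full_batch_gd(w, lr=0.1, steps=100):
    # The "naive" approach: every update averages gradients over all examples,
    # so a single step requires a full pass over the dataset.
    for _ in range(steps):
        w = w - lr * grad(w, X, y)
    return w

def sgd(w, lr=0.01, steps=2000):
    # The "extreme" approach: each update is based on one randomly drawn example.
    for _ in range(steps):
        i = rng.integers(len(y))
        w = w - lr * grad(w, X[i:i + 1], y[i:i + 1])
    return w

w_gd = full_batch_gd(np.zeros(3))
w_sgd = sgd(np.zeros(3))
print("full batch:", np.round(w_gd, 2))
print("sgd:      ", np.round(w_sgd, 2))
```

Both variants head toward the same solution; the full-batch version takes few but expensive steps, while SGD takes many cheap, noisy ones.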