
machine learning .p11 - Little maths behind Gradient Descent. [Hindi]


Hello geeks, this is the 11th video of the machine learning tutorial. In this video we'll talk about a little of the maths behind the gradient descent method. You don't need to be a great mathematician for this; a high school student can easily understand the concept.

Implementing Gradient Descent using NumPy matrix multiplication


In the previous section, I taught you about the log-loss function. There are many other error functions used for neural networks. Let me teach you another one, called the mean squared error. As the name says, this one is the mean of the squares of the differences between the predictions and the labels. In the following section I'll go over it in detail, then we'll implement backpropagation with it on the same student admissions dataset. And as a bonus, we'll implement this efficiently using matrix multiplication with NumPy! The goal is to find the weights for our neural network.
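
To make the idea concrete, here is a minimal sketch of gradient descent on the mean squared error for a single sigmoid unit, with the weight update written as one matrix multiplication. It is not the tutorial's own code: the student admissions dataset is not reproduced here, so the data is a synthetic stand-in, and the names `train`, `learn_rate`, and `epochs` are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mse(predictions, labels):
    # Mean of the squared differences between predictions and labels.
    return np.mean((predictions - labels) ** 2)

def train(features, labels, learn_rate=0.5, epochs=500, seed=0):
    # Hypothetical helper: fits one sigmoid unit by batch gradient descent.
    rng = np.random.default_rng(seed)
    n_records, n_features = features.shape
    weights = rng.normal(scale=1.0 / np.sqrt(n_features), size=n_features)
    bias = 0.0
    losses = []
    for _ in range(epochs):
        # Forward pass for every record at once: one matrix-vector product.
        output = sigmoid(features @ weights + bias)
        losses.append(mse(output, labels))
        # Chain rule: d(MSE)/dz = 2 * (output - label) * sigmoid'(z) / n,
        # where sigmoid'(z) = output * (1 - output).
        delta = 2.0 * (output - labels) * output * (1.0 - output) / n_records
        # features.T @ delta sums each record's contribution to every weight.
        weights -= learn_rate * (features.T @ delta)
        bias -= learn_rate * delta.sum()
    return weights, bias, losses

# Synthetic stand-in data (an assumption, not the admissions dataset).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 3))
y = (X @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)

weights, bias, losses = train(X, y)
print("first loss:", losses[0], "last loss:", losses[-1])
```

The point of the matrix form is that `features.T @ delta` replaces an explicit loop over records and weights, which is typically much faster in NumPy.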

VAE Learning via Stein Variational Gradient Descent

Neural Information Processing Systems

A new method for learning variational autoencoders (VAEs) is developed, based on Stein variational gradient descent. A key advantage of this approach is that one need not make parametric assumptions about the form of the encoder distribution. Performance is further enhanced by integrating the proposed encoder with importance sampling. Excellent performance is demonstrated across multiple unsupervised and semi-supervised problems, including semi-supervised analysis of the ImageNet data, demonstrating the scalability of the model to large datasets. Papers published at the Neural Information Processing Systems Conference.

r/VisualMath - Gradient descent at the very core of Artificial Intelligence


In the end, training a network is the solution of a very high-dimensional non-linear optimization problem (finding the minimum of a function). The graphic shows a two-dimensional optimization problem and how the gradient descent algorithm approaches the minimum (center). In 2D this is trivial; in higher dimensions it is computationally intensive. The slider sets the step size. You want big steps to find the solution fast, but if the step size gets too big, the optimizer starts to oscillate.
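
The effect of the step size can be sketched in a few lines. This is only an illustration, not the code behind the graphic: the function f(x, y) = x^2 + y^2, the starting point, and the three step sizes are all assumptions picked to show a smooth approach, an oscillating approach, and divergence.

```python
def gradient(point):
    # Gradient of the assumed function f(x, y) = x**2 + y**2.
    x, y = point
    return (2.0 * x, 2.0 * y)

def descend(start, step_size, n_steps=20):
    # Plain gradient descent; returns every visited point.
    x, y = start
    path = [(x, y)]
    for _ in range(n_steps):
        gx, gy = gradient((x, y))
        x, y = x - step_size * gx, y - step_size * gy
        path.append((x, y))
    return path

def distance_to_minimum(point):
    # The minimum of the assumed function is at the origin.
    return (point[0] ** 2 + point[1] ** 2) ** 0.5

for step_size in (0.1, 0.9, 1.1):
    path = descend((1.0, 1.0), step_size)
    print(step_size, distance_to_minimum(path[-1]))
```

For this particular function each step multiplies the coordinates by (1 - 2 * step_size). A step size of 0.1 shrinks them steadily, 0.9 flips their sign on every step while still shrinking them, and 1.1 flips the sign and grows them, so the optimizer overshoots further each time.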