Escaping from Saddle Points

Non-convex functions can be much more complicated than convex ones. In this post we will discuss the various types of critical points that you might encounter when you go off the convex path. In particular, we will see that in many cases simple heuristics based on gradient descent can lead you to a local minimum in polynomial time. To minimize a function f, the most popular approach is to repeatedly step in the direction opposite to the gradient: x_{t+1} = x_t - \eta \nabla f(x_t). Here \eta is a small step size. This is the gradient descent algorithm.
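
To make gradient descent and the role of the step size \eta concrete, here is a minimal Python sketch. The toy function f(x, y) = x**2 + y**4 / 4 - y**2 / 2, the starting points, and the names grad_f and gradient_descent are all assumptions chosen for illustration and do not come from the post. This function has a saddle point at (0, 0) and local minima at (0, 1) and (0, -1).

```python
def grad_f(x, y):
    """Gradient of the toy function f(x, y) = x**2 + y**4 / 4 - y**2 / 2."""
    return 2.0 * x, y**3 - y


def gradient_descent(x, y, eta=0.1, steps=500):
    """Repeat the update (x, y) <- (x, y) - eta * grad f(x, y)."""
    for _ in range(steps):
        gx, gy = grad_f(x, y)
        x, y = x - eta * gx, y - eta * gy
    return x, y


# Exactly at the saddle point the gradient is zero, so the iterate never moves.
stuck = gradient_descent(0.0, 0.0)

# A tiny offset along y is enough to slide down to the local minimum at (0, 1).
escaped = gradient_descent(0.5, 1e-3)

print("started at the saddle:", stuck)
print("started slightly off: ", escaped)
```

In this sketch the run that starts exactly on the saddle stays there, while the run that starts slightly off it ends up near (0, 1). The step size eta=0.1 is small enough for this particular function; a much larger value could overshoot.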