Backtracking gradient descent method for general $C^1$ functions, with applications to Deep Learning