What is the long-run distribution of stochastic gradient descent? A large deviations analysis