A Large Deviations Perspective on Policy Gradient Algorithms