Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms