Policy Gradient for Rectangular Robust Markov Decision Processes