Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs