Relative Entropy Regularized Reinforcement Learning for Efficient Encrypted Policy Synthesis