Regret-Based Optimization for Robust Reinforcement Learning