EXPO: Stable Reinforcement Learning with Expressive Policies