Lagrangian Duality in Reinforcement Learning