Structural Return Maximization for Reinforcement Learning