Expressing Arbitrary Reward Functions as Potential-Based Advice