Maximum Reward Formulation In Reinforcement Learning