Efficient Average Reward Reinforcement Learning Using Constant Shifting Values