Beyond Simple Sum of Delayed Rewards: Non-Markovian Reward Modeling for Reinforcement Learning