Rating-based Reinforcement Learning