Bias Mitigation via Compensation: A Reinforcement Learning Perspective