Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation