Human Preference Scaling with Demonstrations For Deep Reinforcement Learning