A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback