Evaluating and Improving Context Attention Distribution on Multi-Turn Response Generation using Self-Contained Distractions