Improving End-To-End Modeling for Mispronunciation Detection with Effective Augmentation Mechanisms