Neural Information Processing Systems 

We would like to thank each of the reviewers for their constructive and insightful comments on our manuscript. We will also further polish the paper based on your suggestions to address the remaining writing issues. The reasons are discussed in lines 308-315 of our paper. R3, R5: Explanation of why we use self-attention. In addition, we agree that it is more realistic to model label uncertainty.