dc6a7e655d7e5840e66733e9ee67cc69-AuthorFeedback.pdf
–Neural Information Processing Systems
We thank all the reviewers for their helpful suggestions. We will incorporate the following analysis into our revision. Firstly, we found 4 typical attention patterns shared by both XLNet and BERT, as shown in Figure 1.

Figure 1: Attention patterns shared by XLNet and BERT. Rows and columns represent query and key, respectively.
Feb-14-2026, 14:43:18 GMT