Reviews: Understanding Attention and Generalization in Graph Neural Networks

Neural Information Processing Systems 

This paper studies node-wise attention in graph neural networks, with the aim of characterizing when attention works well. The authors show that attention often yields only marginal benefits over non-attentive baselines. To address this, they propose a weakly-supervised attention scheme that tends to improve performance. The experiments are thorough and well presented. Reviewers raised some presentation issues that should be addressed in future versions of the manuscript.