chebygin
4c5bcfec8584af0d967f1ab10179ca4b-AuthorFeedback.pdf
For a more reliable comparison, we repeat experiments for 100 random seeds instead of 10. "init tune" denotes tuning σ and choosing between N or U (see Figure 1 at the bottom); tuning is done in the same way as for other hyperparameters. We will also add results of GCN supporting our conclusions (Table 1 and Figure 1). Note that in Table 1 of the submitted paper, for COLORS and MNIST-75sp, ChebyGINs are equivalent to ChebyNets as described in Table 1 of the Supplementary material and elaborated on following that table (see footnote 3). In our model, the features are weighted by attention scores according to Eq. 3, so it is soft. In this case, the features indeed reduce their scale.
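The remark about soft attention reducing feature scale can be illustrated with a small sketch. Eq. 3 is not reproduced in this excerpt, so the code below assumes a common form of soft attention: per-node scores are normalized with a softmax and each node's feature vector is multiplied by its score. The function names (`softmax`, `soft_attention`, `mean_abs`) and the toy values are hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of per-node scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(features, scores):
    # Assumed form of soft attention: scale each node's feature
    # vector by its normalized attention coefficient.
    alpha = softmax(scores)
    weighted = [[a * x for x in row] for a, row in zip(alpha, features)]
    return alpha, weighted

def mean_abs(features):
    # Average absolute feature value, used here as a rough scale measure.
    values = [abs(x) for row in features for x in row]
    return sum(values) / len(values)

# Toy graph with 4 nodes and 2 features per node (illustrative values only).
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
scores = [0.1, 0.4, -0.2, 0.3]

alpha, weighted = soft_attention(features, scores)
scale_before = mean_abs(features)
scale_after = mean_abs(weighted)
print(scale_before, scale_after)
```

Because every softmax coefficient lies strictly between 0 and 1 when there is more than one node, each weighted feature is smaller in magnitude than the original, which is consistent with the scale reduction mentioned above.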
Reviews: Understanding Attention and Generalization in Graph Neural Networks
UPDATE: I have increased the score to 6, provided that the authors revise the paper as promised in their responses. This paper discusses more than one topic. The first part talks mostly about the attention mechanism, the second section introduces a new model, ChebyGIN, and the third section proposes a weakly-supervised attention training approach. Overall, the paper is not all about its title, "Understanding Attention in Graph Neural Networks". In 2.3 the paper says "the performance of both GCNs and GINs is quite poor and, consequently, it is also hard for the attention subnetwork to learn", and thus it proposes ChebyGIN as a stronger model.
Understanding attention in graph neural networks
Knyazev, Boris, Taylor, Graham W., Amer, Mohamed R.
We aim to better understand attention over nodes in graph neural networks and identify factors influencing its effectiveness. Motivated by insights from the work on Graph Isomorphism Networks (Xu et al., 2019), we design simple graph reasoning tasks that allow us to study attention in a controlled environment. We find that under typical conditions the effect of attention is negligible or even harmful, but under certain conditions it provides an exceptional gain in performance of more than 40% in some of our classification tasks. However, we have yet to satisfy these conditions in practice.