Reviews: Rethinking Kernel Methods for Node Representation Learning on Graphs

Neural Information Processing Systems 

After rebuttal: thank you for the additional experiments. They strengthen the empirical contribution of the paper, so I've increased my score to a 7. ________________ Originality: The paper is a novel combination of known techniques: by reinterpreting the the iterative node aggregation procedure of Kipf et al's GCN as feature smoothing technique, they develop a novel feature mapping function for learning positive semi-definite (psd) graph kernels. The key difference from the Kipf et al approach is they separate the node aggregation and non-linear representation learning components: node features are the output of a multi-layer perceptron and then aggregated once (rather than at every layer) by a multi-hop aggregation function. They argue theoretically that this approach is universal in the sense that it can approximate any invertible psd kernel. Quality: I thought the empirical results of the paper were interesting because they suggest that decoupling the aggregation and representation learning components of GCN-style models leads to better performance (at least on these datasets).