Review for NeurIPS paper: Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks

Neural Information Processing Systems 

Weaknesses: The training methodology is not clearly described. Is it a regression-based loss on the numerical value? It would be good to discuss how graph attention networks (GATs) relate to IPA-GNN, since GATs also allow a node to attend with different weights to incoming messages from its neighbors. Note that GAT is a convolution-based GNN and does not use a recurrent unit, so you would have to adapt the attention mechanism to the GGNN setting.
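
To make the suggested adaptation concrete, here is a minimal sketch of one way a GAT-style attention score could be combined with a GGNN-style recurrent (GRU) node update. It is an illustration under stated assumptions, not the authors' model and not a reference implementation of either GAT or GGNN: it assumes a single edge type, a single attention head, and a shared projection `W`. All names (`attention_weights`, `gru_update`, `attentive_ggnn_step`, and the parameter keys) are hypothetical.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_weights(z, in_neighbors, a):
    """GAT-style coefficients over incoming messages.

    z[i] is the projected state of node i, in_neighbors[i] lists the
    senders j with an edge j -> i (assumed free of duplicates), and a is
    a scoring vector of length 2 * len(z[i]).
    Returns one dict per receiver i mapping sender j -> weight.
    """
    alpha = []
    for i, nbrs in enumerate(in_neighbors):
        if not nbrs:
            alpha.append({})  # no incoming edges: no message
            continue
        # score(i, j) = LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(w * x for w, x in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha.append({j: e / total for j, e in zip(nbrs, exps)})
    return alpha


def gru_update(m, h, p):
    """Recurrent node update as used in GGNN: new state from message m and state h."""
    r = [sigmoid(x + y) for x, y in zip(matvec(p["Wr"], m), matvec(p["Ur"], h))]
    u = [sigmoid(x + y) for x, y in zip(matvec(p["Wu"], m), matvec(p["Uu"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    c = [math.tanh(x + y) for x, y in zip(matvec(p["Wc"], m), matvec(p["Uc"], rh))]
    return [(1 - ui) * hi + ui * ci for ui, hi, ci in zip(u, h, c)]


def attentive_ggnn_step(h, in_neighbors, p):
    """One propagation step: attention-weighted aggregation, then a GRU update."""
    z = [matvec(p["W"], hi) for hi in h]
    alpha = attention_weights(z, in_neighbors, p["a"])
    new_h = []
    for i, hi in enumerate(h):
        m = [0.0] * len(hi)
        for j, w in alpha[i].items():
            m = [mk + w * zk for mk, zk in zip(m, z[j])]
        new_h.append(gru_update(m, hi, p))
    return new_h
```

The only change relative to a plain GGNN step in this sketch is that the unweighted sum over incoming messages is replaced by a softmax-weighted sum. Whether this matches the baseline the comparison calls for would depend on details such as edge types and the number of propagation steps, which are left out here.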