As Reviewer 2 suggested, we have added the PNA without std and

Neural Information Processing Systems 

First, we would like to thank all the reviewers for their insightful comments and suggestions. We understand the concern of Reviewers 2 and 3 about the non-standard GRU-based architecture, and we will incorporate this discussion into our paper. Finally, we want to thank Reviewer 4 for bringing to our attention the interesting work by Lee et al. on mixed pooling.