Supplementary Material

Neural Information Processing Systems 

We adopt four bioinformatics datasets in the experiment. Given an input graph, the perturbation procedure randomly adds or removes a portion of the connections between nodes with probability 0.2. It also sets the features of 20% of the nodes in the graph to Gaussian noise whose mean and standard deviation are both 0.5. We adopt the Adam [5] optimizer, a variant of stochastic gradient descent (SGD) with adaptive moment estimation.
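The perturbation described above can be illustrated with a minimal NumPy sketch. It is not the implementation used in the experiments. It assumes an undirected graph stored as a dense binary adjacency matrix, and it reads "add or cut with probability 0.2" as flipping each node pair independently with that probability. The function name `perturb_graph` and all of its parameters are hypothetical.

```python
import numpy as np


def perturb_graph(adj, feats, edge_prob=0.2, node_frac=0.2,
                  noise_mean=0.5, noise_std=0.5, seed=0):
    """Return a perturbed copy of (adj, feats) and the indices of noised nodes.

    Assumptions (not stated in the text): adj is a symmetric 0/1 matrix with a
    zero diagonal, and each node pair is flipped independently.
    """
    rng = np.random.default_rng(seed)
    n = adj.shape[0]

    # Sample a flip mask on the strict upper triangle, then mirror it so the
    # perturbed graph stays undirected and free of self-loops.
    upper = np.triu(rng.random((n, n)) < edge_prob, k=1)
    flip = upper | upper.T
    # Flipping adds an edge where none existed and removes one where it did.
    noisy_adj = np.where(flip, 1 - adj, adj)

    # Replace the features of a random subset of nodes with Gaussian noise.
    k = int(round(node_frac * n))
    idx = rng.choice(n, size=k, replace=False)
    noisy_feats = feats.astype(float).copy()
    noisy_feats[idx] = rng.normal(noise_mean, noise_std,
                                  size=(k, feats.shape[1]))
    return noisy_adj, noisy_feats, idx
```

Other readings of the edge perturbation are possible, for example perturbing a fixed 20% of the existing edges; the sketch would need to be adapted accordingly.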