Appendix

Neural Information Processing Systems 

This is the appendix of our work 'GNNEvaluator: Evaluating GNN Performance On Unseen [...]'.

We provide the statistics of the datasets used in our experiments in Table. For all GNN and MLP models, the default settings are: (a) the number of layers is 2; (b) the hidden feature dimension is 128; (c) the output feature dimension before the softmax operation is 16. The hyperparameters for training these GNN and MLP models are shown in Table A2.

As a vital component of our proposed two-stage GNN model evaluation framework, the DiscGraph set captures wide-ranging and diverse graph data distribution discrepancies. In Fig. A1, we present more visualization results on the discrepancy node attributes in the proposed DiscGraph set for different GNN models, i.e., (a) GAT, (b) GraphSAGE, and (c) GIN, under