f1c1592588411002af340cbaedd6fc33-Supplemental.pdf

Neural Information Processing Systems 

Figure 2: These two graphs cannot be distinguished by 1-WL-test. The COMBINE step takes the result of AGGREGATE and the previous representation of current node asinput. Wereduce theFFN inner-layer dimension of4din [47] tod, which does not appreciably hurt the performance but significantly save the parameters. The embedding dropout ratio is set to 0.1 by default in many previous Transformer works[11,34]. The rest of hyper-parameters remain unchanged. Table 8 summarizes the hyper-parameters used for fine-tuning Graphormer on OGBGMolPCBA.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found