TNASP: A Transformer-based NAS Predictor with a Self-evolution Framework - Supplementary Materials

Neural Information Processing Systems 

When replacing our Transformer with a GCN, the resulting model is almost the same as the one used in NP (GCN) [14], which is clearly worse than our method. Moreover, our method does not explicitly add the validation data to the training set, as the pseudo-label technique does. Comparisons with other methods are summarized in Tab. 8, and we visualize our searched architectures in Sec. C.2. In the MobileNet-like search space, we retrain the searched architecture for 240 epochs with batch size 1024 on 8 NVIDIA V100 GPUs.
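The retraining setup above can be summarized as a small configuration sketch. This is an illustrative assumption, not the authors' released code: the class and field names are hypothetical, and the even per-GPU batch split simply reflects standard data parallelism with the quoted numbers (240 epochs, global batch size 1024, 8 GPUs).

```python
from dataclasses import dataclass


@dataclass
class RetrainConfig:
    """Hypothetical retraining configuration for the MobileNet-like search space."""
    epochs: int = 240            # number of retraining epochs quoted in the text
    global_batch_size: int = 1024  # total batch size across all GPUs
    num_gpus: int = 8            # NVIDIA V100 GPUs used for retraining

    @property
    def per_gpu_batch_size(self) -> int:
        # Under standard data parallelism the global batch is split
        # evenly across devices; require an exact division.
        assert self.global_batch_size % self.num_gpus == 0
        return self.global_batch_size // self.num_gpus


cfg = RetrainConfig()
print(cfg.per_gpu_batch_size)  # 1024 / 8 = 128 samples per GPU
```

Under this even split, each of the 8 GPUs processes 128 samples per step.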
