SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS

Yameng Peng, Andy Song, Haytham M. Fayek, Vic Ciesielski, Xiaojun Chang

arXiv.org Artificial Intelligence 

Recent studies show that existing training-free metrics have several limitations, such as limited correlation and poor generalisation across different search spaces and tasks. Hence, we propose Sample-Wise Activation Patterns and its derivative, SWAP-Score, a novel high-performance training-free metric. It measures the expressivity of networks over a batch of input samples. The SWAP-Score is strongly correlated with ground-truth performance across various search spaces and tasks, outperforming 15 existing training-free metrics on NAS-Bench-101/201/301 and TransNAS-Bench-101. The SWAP-Score can be further enhanced by regularisation, which leads to even higher correlations in cell-based search spaces and enables model size control during the search. For example, the Spearman's rank correlation coefficient between the regularised SWAP-Score and CIFAR-100 validation accuracies of NAS-Bench-201 networks is 0.90, significantly higher than the 0.80 achieved by the second-best metric, NWOT. When integrated with an evolutionary algorithm for NAS, our SWAP-NAS achieves competitive performance on CIFAR-10 and ImageNet in approximately 6 minutes and 9 minutes of GPU time, respectively.

Performance evaluation of neural networks is critical, especially in Neural Architecture Search (NAS), which aims to automatically construct high-performing neural networks for a given task. The conventional approach evaluates candidate networks by feed-forward and back-propagation training. This process typically requires every candidate to be trained on the target dataset until convergence (Liu et al., 2019; Zoph & Le, 2017), and often leads to prohibitively high computational cost (Ren et al., 2022; White et al., 2023). To mitigate this cost, several alternatives have been introduced, such as performance predictors, architecture comparators and weight-sharing strategies. A different line of work uses training-free metrics, also known as zero-cost proxies (Chen et al., 2021a; Lin et al., 2021; Lopes et al., 2021; Mellor et al., 2021; Mok et al., 2022; Tanaka et al., 2020b; Li et al., 2023), with the aim of eliminating the need for network training entirely. These metrics are either positively or negatively correlated with the networks' ground-truth performance.
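To make the activation-pattern idea concrete, below is a minimal sketch of a SWAP-style score. It assumes the score is obtained by feeding one mini-batch through the network, binarising post-ReLU activations, and counting the distinct patterns each unit produces across the batch; layer-wise aggregation details and the regularisation term mentioned above are omitted, so this illustrates the mechanism rather than the paper's exact implementation.

```python
# Minimal sketch of a SWAP-style activation-pattern score (illustrative only).
# Assumption: the score counts distinct binary ReLU activation patterns taken
# "sample-wise" (one pattern per unit across the batch); layer aggregation and
# the regularisation term from the paper are not reproduced here.
import torch
import torch.nn as nn


def swap_style_score(model: nn.Module, inputs: torch.Tensor) -> int:
    """Count unique per-unit binary activation patterns over one input batch."""
    patterns = []

    def hook(_module, _inp, out):
        # Binarise post-ReLU activations: 1 if the unit fires, 0 otherwise.
        binary = (out > 0).flatten(start_dim=1)       # (batch, units)
        patterns.append(binary.t().to(torch.uint8))   # (units, batch)

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        model(inputs)
    for h in handles:
        h.remove()

    # Each row is one unit's firing pattern across the batch; expressive
    # networks produce many distinct rows, so we count the unique ones.
    all_units = torch.cat(patterns, dim=0)
    return torch.unique(all_units, dim=0).shape[0]


if __name__ == "__main__":
    net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(),
                        nn.Conv2d(16, 32, 3), nn.ReLU(),
                        nn.Flatten(), nn.LazyLinear(10))
    batch = torch.randn(64, 3, 32, 32)   # one mini-batch of input samples
    print("SWAP-style score:", swap_style_score(net, batch))
```

In this reading, a larger count means the network separates the input batch into more distinct activation regions, i.e. it is more expressive, which is the quantity the training-free metric uses as a stand-in for trained performance.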

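Zero-cost proxies of this kind are typically judged by how well they rank architectures relative to trained accuracy, which is where the Spearman correlation figures quoted above come from. A small, self-contained example of that evaluation, using placeholder numbers rather than NAS-Bench-201 data:

```python
# Benchmarking a zero-cost proxy: rank-correlate its scores with ground-truth
# validation accuracies over a set of architectures. Values are placeholders.
from scipy.stats import spearmanr

proxy_scores = [120, 340, 95, 410, 280]          # one proxy score per network
val_accuracies = [61.2, 70.4, 58.9, 72.1, 68.0]  # matching ground-truth accuracies (%)

rho, p_value = spearmanr(proxy_scores, val_accuracies)
print(f"Spearman's rho = {rho:.2f} (p = {p_value:.3f})")
# A rho near 1 means the proxy ranks networks almost exactly as
# training-based evaluation would, which is the property reported in the paper.
```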