Zero-Shot Neural Architecture Search: Challenges, Solutions, and Opportunities

Guihong Li, Duc Hoang, Kartikeya Bhardwaj, Ming Lin, Zhangyang Wang, Radu Marculescu

arXiv.org Artificial Intelligence 

Abstract--Recently, zero-shot (or training-free) Neural Architecture Search (NAS) approaches have been proposed to liberate NAS from its expensive training process. The key idea behind zero-shot NAS approaches is to design proxies that can predict the accuracy of a given network without training its parameters. The proxies proposed so far are usually inspired by recent progress in the theoretical understanding of deep learning and have shown great potential on several datasets and NAS benchmarks. This paper aims to comprehensively review and compare the state-of-the-art (SOTA) zero-shot NAS approaches, with an emphasis on their hardware awareness. To this end, we first review the mainstream zero-shot proxies and discuss their theoretical underpinnings. We then compare these zero-shot proxies through large-scale experiments and demonstrate their effectiveness in both hardware-aware and hardware-oblivious NAS scenarios. Finally, we point out several promising ideas for designing better proxies.

In recent years, deep neural networks have made significant breakthroughs in many applications, such as recommendation systems, image classification, and natural language modeling [1], [2], [3], [4], [5], [6], [7]. To automatically design [...]

[...] via a hyper-network [11], [32], [33], [34], [35], [36], [37]. As shown in Figure 2, one-shot NAS only needs to train a single hyper-network instead of multiple candidate architectures, whose number is usually exponentially large.
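The core idea stated in the abstract, i.e., ranking candidate architectures with a training-free proxy rather than training each one, can be sketched as follows. This is a minimal illustration, not any specific method from the paper: the MLP search space, the encoding by hidden-layer widths, and the use of parameter count as the proxy are all assumptions made here for concreteness (parameter count is only a simple baseline; real zero-shot proxies are typically gradient- or expressivity-based), but the selection loop has the same shape.

```python
import itertools

def param_count(widths, in_dim=32, out_dim=10):
    # Training-free baseline proxy (an illustrative assumption): the total
    # number of weights and biases in an MLP whose hidden-layer widths are
    # given by `widths`. No network is ever instantiated or trained.
    dims = [in_dim, *widths, out_dim]
    return sum(dims[i] * dims[i + 1] + dims[i + 1] for i in range(len(dims) - 1))

def zero_shot_search(search_space, proxy):
    # The zero-shot NAS loop: score every candidate with the proxy and
    # return the highest-scoring architecture -- no parameter training.
    return max(search_space, key=proxy)

# Hypothetical search space: all 2-hidden-layer MLPs with widths in {16, 64, 128}.
space = list(itertools.product([16, 64, 128], repeat=2))
best = zero_shot_search(space, param_count)
print(best)  # -> (128, 128), the widths of the highest-scoring candidate
```

In a realistic setting, `param_count` would be replaced by a proxy that actually correlates with trained accuracy, and the exhaustive `max` over the search space would be replaced by a sampler, since the number of candidates is usually exponentially large.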
