Tiered Pruning for Efficient Differentiable Inference-Aware Neural Architecture Search
Sławomir Kierat, Mateusz Sieniawski, Denys Fridman, Chen-Han Yu, Szymon Migacz, Paweł Morkisz, Alex Fit-Florea
We propose three novel pruning techniques to improve the cost and results of Inference-Aware Differentiable Neural Architecture Search (DNAS). First, we introduce Prunode, a stochastic bi-path building block for DNAS, which can search over inner hidden dimensions with O(1) memory and compute complexity. Second, we present an algorithm for pruning blocks within a stochastic layer of the SuperNet during the search. Third, we describe a novel technique for pruning unnecessary stochastic layers during the search. New concepts in NAS rely on ever-growing search spaces, which increase the dimensionality and complexity of the problem. Balancing search cost against search quality is therefore essential for employing NAS in practice. Traditional NAS methods require evaluating many candidate networks to find those optimized with respect to the desired metric. This approach can be applied successfully to simple problems like CIFAR-10 Krizhevsky et al. (2010), but for more demanding problems these methods may turn out to be computationally prohibitive. To minimize this computational cost, recent research has focused on partial training Falkner et al. (2018); Li et al. (2020a); Luo et al. (2018), performing network morphism Cai et al. (2018a); Jin et al. (2019); Molchanov et al. (2021) instead of training from scratch, or training many candidates at the same time by sharing their weights Pham et al. (2018). These approaches can save computational time, but their reliability is questionable Bender et al. In our experiments, we focus on a search space based on a state-of-the-art network to showcase the value of our methodology.
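To make the bi-path idea concrete, the sketch below is a minimal, illustrative stochastic block of our own devising, not the authors' Prunode implementation: it materializes only two candidate inner hidden widths ("small" and "large") at a time and mixes their outputs with a differentiable softmax over two architecture logits, so memory and compute stay constant regardless of how many hidden dimensions are being searched. The class name BiPathBlock, the 1x1 convolutions, and the softmax relaxation are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BiPathBlock(nn.Module):
    """Illustrative bi-path stochastic block (hypothetical, not the paper's Prunode).

    Only two candidate inner widths are instantiated at any time, so memory and
    compute are O(1) in the number of searched hidden dimensions.
    """

    def __init__(self, in_ch: int, out_ch: int, small_hidden: int, large_hidden: int):
        super().__init__()
        # Two parallel expand -> project paths with different inner hidden widths.
        self.small = nn.Sequential(
            nn.Conv2d(in_ch, small_hidden, kernel_size=1), nn.ReLU(),
            nn.Conv2d(small_hidden, out_ch, kernel_size=1),
        )
        self.large = nn.Sequential(
            nn.Conv2d(in_ch, large_hidden, kernel_size=1), nn.ReLU(),
            nn.Conv2d(large_hidden, out_ch, kernel_size=1),
        )
        # Architecture logits: a relaxed, differentiable choice between the two paths.
        self.alpha = nn.Parameter(torch.zeros(2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = F.softmax(self.alpha, dim=0)
        # Convex combination of the two paths; gradients on `alpha` indicate whether
        # the searched hidden dimension should move toward the smaller or larger width.
        return w[0] * self.small(x) + w[1] * self.large(x)


if __name__ == "__main__":
    # Minimal usage example on a dummy feature map.
    block = BiPathBlock(in_ch=16, out_ch=16, small_hidden=32, large_hidden=64)
    out = block(torch.randn(2, 16, 8, 8))
    print(out.shape)  # torch.Size([2, 16, 8, 8])
```

In a search loop under this assumption, one would periodically re-center the two candidate widths around whichever path the architecture weights favor, progressively narrowing the searched range while never holding more than two paths in memory.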
arXiv.org Artificial Intelligence
Jan-5-2023