The expressiveness of the search space is a key concern in neural architecture search (NAS). Previous approaches are mainly limited to searching for single-path networks, and incorporating a multi-path search space into the current one-shot paradigm remains an open problem. In this paper, we investigate supernet behavior in the multi-path setting. We show that a trivial generalization from single-path to multi-path incurs severe feature inconsistency, which degrades both supernet training stability and model ranking ability. To remedy this degradation, we employ what we term shadow batch normalizations (SBN) to capture the changing statistics that arise when different sets of paths are activated. Extensive experiments on a common NAS benchmark, NAS-Bench-101, show that SBN boosts ranking performance at negligible cost, surpassing the previous best Kendall Tau by a clear margin and reaching 0.597. Moreover, we exploit feature similarities among activated paths to greatly reduce the number of SBNs needed. We call our method MixPath. When searching proxylessly on ImageNet, we obtain several lightweight models that outperform EfficientNet-B0 with fewer FLOPs and parameters, using 300x fewer search resources. Our code will be available at https://github.com/xiaomi-automl/MixPath.git .
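The idea of shadow batch normalization can be illustrated with a toy sketch. It assumes, beyond what the abstract states, that one set of running statistics is kept per number of active paths and that path outputs are combined by element-wise summation. The class name `ShadowBatchNorm`, its parameters, and the input values are hypothetical, and the learnable scale and shift of a real batch normalization layer are omitted.

```python
class ShadowBatchNorm:
    """Toy 1-D batch norm with separate running statistics per active-path count.

    Assumption for illustration: statistics are indexed by how many paths
    are active, not by which paths are active.
    """

    def __init__(self, max_paths, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = [0.0] * max_paths
        self.running_var = [1.0] * max_paths

    def __call__(self, path_outputs, training=True):
        m = len(path_outputs)
        if not 1 <= m <= len(self.running_mean):
            raise ValueError("unsupported number of active paths")
        # Combine the active paths by element-wise summation.
        summed = [sum(values) for values in zip(*path_outputs)]
        idx = m - 1  # select the statistics slot for m active paths
        if training:
            mean = sum(summed) / len(summed)
            var = sum((v - mean) ** 2 for v in summed) / len(summed)
            mom = self.momentum
            self.running_mean[idx] = (1 - mom) * self.running_mean[idx] + mom * mean
            self.running_var[idx] = (1 - mom) * self.running_var[idx] + mom * var
        else:
            mean, var = self.running_mean[idx], self.running_var[idx]
        return [(v - mean) / (var + self.eps) ** 0.5 for v in summed]


sbn = ShadowBatchNorm(max_paths=2)
one_path = sbn([[1.0, 2.0, 3.0]])
two_paths = sbn([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])
```

Summing two paths doubles the mean of the features in this example, so a single shared set of statistics would be pulled between the two regimes. Keeping one slot per path count lets each regime track its own mean and variance.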
The evolution of MobileNets has laid a solid foundation for neural network applications on mobile devices. With the latest MobileNetV3, neural architecture search has again demonstrated its strength in network design. To date, mobile-oriented methods have focused mainly on CPU latency rather than GPU latency, although the GPU has lower overhead and interference and is much preferred in industry. To close this gap, we propose the first Mobile GPU-Aware (MoGA) neural architecture search, tailored precisely to real-world applications. Further, the ultimate objective in devising a mobile network is to achieve better performance by maximizing the utilization of bounded resources. While pushing for higher capability and restraining time consumption, we unconventionally encourage increasing the number of parameters for higher representational power. These three forces are not readily reconcilable, so we ease the tension among them with weighted evolution techniques. Lastly, we deliver searched networks at mobile scale that outperform MobileNetV3 under similar latency constraints: MoGA-A achieves 75.9% top-1 accuracy on ImageNet, and MoGA-B reaches 75.5% at a cost of only 0.5 ms more on mobile GPU than MobileNetV3, which scores 75.2%. MoGA-C best attests to GPU-awareness, reaching 75.3% while being slower on CPU but faster on GPU. The models and test code are available at https://github.com/xiaomi-automl/MoGA.
One-shot methods have become among the most popular in Neural Architecture Search (NAS) thanks to weight sharing and the single training of a supernet. However, existing methods generally suffer from two issues: a predetermined number of channels in each layer, which is suboptimal; and model averaging effects and poor ranking correlation caused by weight coupling and a continuously expanding search space. To address these issues explicitly, this paper proposes a Broadening-and-Shrinking One-Shot NAS (BS-NAS) framework. 'Broadening' refers to broadening the search space with a spring block that enables searching over the number of channels during supernet training, while 'shrinking' refers to a novel strategy that gradually turns off underperforming operations. Together, these innovations broaden the search space for wider representation and then shrink it by gradually removing underperforming operations, after which an evolutionary algorithm efficiently searches for the optimal architecture. Extensive experiments on ImageNet demonstrate the effectiveness of BS-NAS as well as its state-of-the-art performance.
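A progressive shrinking step of this general kind might be sketched as follows. The abstract does not say how operations are scored or how many are removed at a time, so the per-layer score dictionaries, the one-removal-per-layer rule, the `min_ops` floor, and all operation names here are assumptions made purely for illustration.

```python
def shrink_step(active_ops, scores, min_ops=1):
    """Drop the lowest-scoring operation from every layer that can spare one.

    active_ops: list of per-layer lists of operation names (hypothetical).
    scores: list of per-layer dicts mapping operation name to a score,
            where higher means better (the scoring rule is an assumption).
    """
    shrunk = []
    for ops, layer_scores in zip(active_ops, scores):
        if len(ops) > min_ops:
            worst = min(ops, key=lambda op: layer_scores[op])
            ops = [op for op in ops if op != worst]
        else:
            ops = list(ops)
        shrunk.append(ops)
    return shrunk


space = [["conv3", "conv5", "skip"], ["conv3", "conv5"]]
scores = [
    {"conv3": 0.7, "conv5": 0.6, "skip": 0.4},
    {"conv3": 0.5, "conv5": 0.8},
]
after_one = shrink_step(space, scores)
after_two = shrink_step(after_one, scores)
```

Calling the function repeatedly during supernet training narrows each layer one operation at a time, leaving a smaller space for the final evolutionary search.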
The ability to rank models by their real strength is the key to Neural Architecture Search. Traditional approaches rely on incomplete training for this purpose, which is still very costly. One-shot methods were thus devised to cut the expense by reusing the same set of weights. However, it is uncertain whether shared weights are truly effective. It is also unclear whether a picked model is better because of its strong representational power or simply because it is overtrained. To remove this doubt, we propose Fair Neural Architecture Search (FairNAS), in which a strict fairness constraint is enforced for fair inheritance and training. In this way, our supernet exhibits good convergence and very high training accuracy. The performance of any sampled model loaded with shared weights from the supernet strongly correlates with that of its stand-alone counterpart when trained fully. This result dramatically improves search efficiency. With a multi-objective reinforced evolutionary search backend, our pipeline generates a new set of state-of-the-art architectures on ImageNet: FairNAS-A attains 75.34% top-1 validation accuracy, FairNAS-B 75.10%, and FairNAS-C 74.69%, with fewer multiply-adds and/or fewer parameters than other models. The models and their evaluation code are publicly available at http://github.com/fairnas/FairNAS.
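One way to read a strict fairness constraint is that, within a single supernet training step, every candidate operation in every layer is activated exactly once. The abstract does not spell out the sampling rule, so the sketch below is an interpretation under that assumption: it draws one random permutation of the choices per layer and builds as many single-path models as there are choices. The function name and arguments are hypothetical.

```python
import random


def strict_fair_sample(num_layers, num_choices, rng=random):
    """Sample num_choices single-path models for one training step.

    Each layer's choices are permuted once, and model k takes the k-th
    entry of every permutation, so each operation is used exactly once.
    """
    perms = [rng.sample(range(num_choices), num_choices) for _ in range(num_layers)]
    return [
        [perms[layer][k] for layer in range(num_layers)]
        for k in range(num_choices)
    ]


models = strict_fair_sample(num_layers=4, num_choices=3, rng=random.Random(0))
```

In a training loop, the gradients of all sampled models would be accumulated before a single weight update, so that no operation receives more updates than its siblings in the same layer.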
Convolutional neural networks are widely adopted in Acoustic Scene Classification (ASC) tasks, but they generally carry a heavy computational burden. In this work, we propose a lightweight yet high-performing baseline network inspired by MobileNetV2, which replaces square convolutional kernels with unidirectional ones to extract features alternately in the temporal and frequency dimensions. Furthermore, we explore a dynamic architecture space built on the proposed baseline using the recent Neural Architecture Search (NAS) paradigm, which first trains a supernet that incorporates all candidate networks and then applies the well-known evolutionary algorithm NSGA-II to discover more efficient networks with higher accuracy and lower computational cost. Experimental results demonstrate that our searched network performs well on ASC tasks, achieving a 90.3% F1-score on the DCASE2018 task 5 evaluation set, a new state of the art, while saving 25% of FLOPs compared to our baseline network.
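NSGA-II ranks candidates by Pareto dominance rather than by a single score. The sketch below shows only that first ingredient, extracting the non-dominated set for the two objectives named above (higher F1-score, lower FLOPs). It is not the full NSGA-II algorithm, which also uses crowding distance and genetic operators, and the candidate values are made up for illustration.

```python
def dominates(a, b):
    """Return True if candidate a is at least as good as b on both
    objectives and strictly better on one. Candidates are (f1, flops)."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical (F1-score, FLOPs in millions) pairs.
candidates = [(0.90, 100), (0.88, 80), (0.85, 120), (0.90, 90)]
front = pareto_front(candidates)
```

Here `(0.90, 100)` is dropped because `(0.90, 90)` matches its F1-score with fewer FLOPs, and `(0.85, 120)` is dropped because it is worse on both objectives.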