Neural Architecture Search in Embedding Space

arXiv.org Machine Learning

Neural architecture search (NAS) with reinforcement learning can be a powerful framework for automatically discovering neural architectures. However, its application is restricted by discontinuous, high-dimensional search spaces, which make optimization difficult. To resolve these problems, we propose NAS in embedding space (NASES). Unlike other reinforcement-learning-based NAS approaches, which search over a discrete, high-dimensional architecture space, NASES lets reinforcement learning search in an embedding space by using architecture encoders and decoders. Our experiments show that the final architecture found by NASES performs comparably to those found by other popular NAS approaches on the CIFAR-10 image classification task. NASES remained effective even when only architecture-embedding search and controller pre-training were applied, without other NAS tricks such as parameter sharing. Specifically, the search cost was reduced considerably: on average, only 100 architectures had to be searched to reach a final architecture.

Introduction: Deep neural networks have enabled advances in image recognition, sequential pattern recognition, recommendation systems, and various other tasks in the past decades.
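As a rough illustration of the embedding-space idea in the NASES abstract, the sketch below searches over continuous vectors and decodes each one back to a discrete architecture. Everything in it is an assumption made for illustration: the three-field search space, the hand-written `encode` and `decode` functions (the paper uses learned encoders and decoders), the `toy_reward` stand-in for validation accuracy, and the use of simple hill climbing in place of the reinforcement learning controller.

```python
import math
import random

# Hypothetical discrete search space: (number of layers, kernel size, width).
ARCHITECTURES = [
    (layers, kernel, width)
    for layers in (2, 4, 8)
    for kernel in (3, 5)
    for width in (16, 32, 64)
]


def encode(arch):
    """Toy encoder: map a discrete architecture to a point in R^3."""
    layers, kernel, width = arch
    return (math.log2(layers), kernel / 2.0, math.log2(width))


def decode(z):
    """Toy decoder: return the architecture whose embedding is nearest to z."""
    return min(ARCHITECTURES, key=lambda a: math.dist(encode(a), z))


def toy_reward(arch):
    """Stand-in for validation accuracy; a real run would train the network."""
    layers, kernel, width = arch
    return -abs(layers - 4) - abs(kernel - 3) - abs(math.log2(width) - 5)


def search(steps=100, sigma=0.5, seed=0):
    """Hill climbing in the continuous embedding space (a swap-in for RL)."""
    rng = random.Random(seed)
    z = encode(rng.choice(ARCHITECTURES))
    best_arch = decode(z)
    best_reward = toy_reward(best_arch)
    for _ in range(steps):
        # Perturb the embedding, then decode it to a valid architecture.
        candidate = tuple(c + rng.gauss(0.0, sigma) for c in z)
        arch = decode(candidate)
        reward = toy_reward(arch)
        if reward >= best_reward:
            z, best_arch, best_reward = candidate, arch, reward
    return best_arch, best_reward


if __name__ == "__main__":
    print(search())
```

The point of the sketch is that the optimizer only ever takes small steps in a continuous space, while `decode` guarantees that every evaluated candidate is a valid discrete architecture.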



ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

arXiv.org Machine Learning

Neural architecture search (NAS) has had a great impact by automatically designing effective neural network architectures. However, the prohibitive computational demand of conventional NAS algorithms (e.g., 10^4 GPU hours) makes it difficult to directly search for architectures on large-scale tasks (e.g., ImageNet). Differentiable NAS can reduce the cost in GPU hours via a continuous representation of the network architecture, but it suffers from high GPU memory consumption, which grows linearly with the candidate set size. As a result, these methods need to use proxy tasks, such as training on a smaller dataset, learning with only a few blocks, or training for just a few epochs. Architectures optimized on proxy tasks are not guaranteed to be optimal on the target task. In this paper, we present ProxylessNAS, which can directly learn architectures for large-scale target tasks and target hardware platforms. We address the high memory consumption of differentiable NAS and reduce the computational cost (GPU hours and GPU memory) to the same level as regular training while still allowing a large candidate set. Experiments on CIFAR-10 and ImageNet demonstrate the effectiveness of directness and specialization. On CIFAR-10, our model achieves 2.08% test error with only 5.7M parameters, better than the previous state-of-the-art architecture AmoebaNet-B, while using 6× fewer parameters. On ImageNet, our model achieves 3.1% better top-1 accuracy than MobileNetV2, while being 1.2× faster in measured GPU latency. We also apply ProxylessNAS to specialize neural architectures for hardware using direct hardware metrics (e.g., latency) and provide insights for efficient CNN architecture design.
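The linear memory growth mentioned in the abstract above can be made concrete with a little arithmetic. The sketch below compares a mixed operation that keeps every candidate's output for a weighted sum with one that executes a single sampled candidate per step. The function names, the 100 MB activation size, and the sampling scheme are assumptions for illustration; the abstract does not spell out the mechanism ProxylessNAS uses.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op_memory(num_candidates, activation_mb):
    """Activation memory when every candidate's output is kept for the weighted sum."""
    return num_candidates * activation_mb


def single_path_memory(num_candidates, activation_mb):
    """Activation memory when only one sampled candidate runs per step.

    num_candidates is accepted for symmetry but does not affect the result.
    """
    return activation_mb


def sample_path(arch_logits, rng):
    """Sample the index of one candidate operation from softmax(arch_logits)."""
    probs = softmax(arch_logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    activation_mb = 100  # hypothetical size of one candidate's output
    for n in (2, 4, 8):
        chosen = sample_path([0.0] * n, rng)
        print(n, mixed_op_memory(n, activation_mb),
              single_path_memory(n, activation_mb), chosen)
```

Under these assumptions, doubling the candidate set doubles the memory of the all-paths variant, while the single-path variant stays at the cost of one candidate, which is what "the same level as regular training" would require.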


Principled Neural Architecture Learning - Intel AI

#artificialintelligence

A neural architecture, which is the structure and connectivity of the network, is typically either hand-crafted or searched by optimizing some specific objective criterion (e.g., classification accuracy). Since the space of all neural architectures is huge, search methods are usually heuristic and do not guarantee finding the optimal architecture with respect to the objective criterion. In addition, these search methods might require a large number of supervised training iterations and a large amount of computational resources, rendering the solution infeasible for many applications. Moreover, optimizing for a specific criterion might result in a model that is suboptimal for other useful criteria such as model size, representation of uncertainty, and robustness to adversarial attacks. Thus, the architectures produced by most strategies used today, whether hand-crafting or heuristic search, are densely connected networks, which are not an optimal solution for the objective they were created to achieve, let alone other objectives.


Deep Neural Network Architectures for Modulation Classification

arXiv.org Machine Learning

In this work, we investigate the value of employing deep learning for the task of wireless signal modulation recognition. Recently, [1] introduced a framework that generates a dataset using GNU Radio, mimics the imperfections of a real wireless channel, and uses 10 different modulation types. Further, a convolutional neural network (CNN) architecture was developed and shown to deliver performance that exceeds that of expert-based approaches. Here, we follow the framework of [1] and find deep neural network architectures that deliver higher accuracy than the state of the art. We tested the architecture of [1] and found that it recognizes the modulation type correctly with an accuracy of approximately 75%. We first tune the CNN architecture of [1] and find a design with four convolutional layers and two dense layers that gives an accuracy of approximately 83.8% at high SNR. We then develop architectures based on the recently introduced ideas of Residual Networks (ResNet [2]) and Densely Connected Networks (DenseNet [3]), which achieve high-SNR accuracies of approximately 83.5% and 86.6%, respectively. Finally, we introduce a Convolutional Long Short-term Deep Neural Network (CLDNN [4]), which achieves an accuracy of approximately 88.5% at high SNR.
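Since the abstract above reports accuracy at high SNR, it may help to see how such a figure could be computed from per-example predictions. The sketch below is a minimal version under stated assumptions: the record layout `(snr_db, true_label, predicted_label)`, the function names, and the 0 dB default threshold for "high SNR" are all hypothetical, and the sample records are made up rather than taken from the paper.

```python
from collections import defaultdict


def accuracy_by_snr(records):
    """Return {snr_db: accuracy} from (snr_db, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for snr_db, true_label, predicted_label in records:
        total[snr_db] += 1
        if true_label == predicted_label:
            correct[snr_db] += 1
    return {snr: correct[snr] / total[snr] for snr in sorted(total)}


def high_snr_accuracy(records, threshold_db=0):
    """Accuracy over all records whose SNR is at or above threshold_db."""
    kept = [(t, p) for snr, t, p in records if snr >= threshold_db]
    if not kept:
        raise ValueError("no records at or above the threshold")
    return sum(t == p for t, p in kept) / len(kept)


# Made-up predictions for illustration only; not results from the paper.
records = [
    (-10, "QPSK", "BPSK"),
    (-10, "BPSK", "BPSK"),
    (0, "QPSK", "QPSK"),
    (0, "8PSK", "QPSK"),
    (10, "QAM16", "QAM16"),
    (10, "BPSK", "BPSK"),
]

if __name__ == "__main__":
    print(accuracy_by_snr(records))
    print(high_snr_accuracy(records, threshold_db=0))
```

Reporting accuracy per SNR bucket, rather than one pooled number, keeps low-SNR examples from masking differences between architectures at the SNR levels where classification is actually feasible.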