The neural architecture search (NAS) algorithm with reinforcement learning can be a powerful and novel framework for the automatic discovering process of neural architectures. However, its application is restricted by noncontinuous and high-dimensional search spaces, which result in difficulty in optimization. To resolve these problems, we proposed NAS in embedding space (NASES), which is a novel framework. Unlike other NAS with reinforcement learning approaches that search over a discrete and high-dimensional architecture space, this approach enables reinforcement learning to search in an embedding space by using architecture encoders and decoders. The current experiment demonstrated that the performance of the final architecture network using the NASES procedure is comparable with that of other popular NAS approaches for the image classification task on CIFAR-10. The beneficial-performance and effectiveness of NASES was impressive even when only the architecture-embedding searching and pre-training controller were applied without other NAS tricks such as parameter sharing. Specifically, considerable reduction in searches was achieved by reducing the average number of searching to 100 architectures to achieve a final architecture for the NASES procedure. Introduction Deep neural networks have enabled advances in image recognition, sequential pattern recognition, recommendation systems, and various tasks in the past decades.
A neural architecture, which is the structure and connectivity of the network, is typically either hand-crafted or searched by optimizing some specific objective criterion (e.g., classification accuracy). Since the space of all neural architectures is huge, search methods are usually heuristic and do not guarantee finding the optimal architecture, with respect to the objective criterion. In addition, these search methods might require a large number of supervised training iterations and use a high amount of computational resources, rendering the solution infeasible for many applications. Moreover, optimizing for a specific criterion might result in a model that is suboptimal for other useful criteria such as model size, representation of uncertainty and robustness to adversarial attacks. Thus, the resulting architectures of most strategies used today, whether hand crafting or heuristic searches, are densely connected networks, which are not an optimal solution for the objective they were created to achieve, let alone other objectives.
Let's talk about Meta-Learning because this is one confusing topic. I wrote a previous post about Deconstructing Meta-Learning which explored "Learning to Learn". I realized thought that there is another kind of Meta-Learning that practitioners are more familiar with. This kind of Meta-Learning can be understood as algorithms the search and select different DL architectures. Hyper-parameter optimization is an instance of this, however there are another more elaborate algorithms that follow the same prescription of searching for architectures.