Learning Architectures from an Extended Search Space for Language Modeling
Yinqiao Li, Chi Hu, Yuhao Zhang, Nuo Xu, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li
Neural architecture search (NAS) has advanced significantly in recent years, but most NAS systems restrict the search to the architecture of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. In particular, we present a general approach, which we call ESS, that learns both intra-cell and inter-cell architectures. To obtain better search results, we design a joint learning method that performs intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural language modeling, it significantly outperforms a strong baseline on the PTB and WikiText data, setting a new state of the art on PTB. Moreover, the learned architectures transfer well to other systems. For example, they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and on the CoNLL chunking task, indicating a promising line of research on large-scale pre-learned architectures.
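To make the idea of searching intra-cell and inter-cell structure jointly in a differentiable system more concrete, the toy sketch below may help. It is not the authors' implementation. It assumes the softmax relaxation commonly used in differentiable architecture search, and the candidate operations, the scalar states, and the names `alpha`, `beta`, `mixed_op`, `inter_cell_input`, and `joint_step` are all hypothetical, chosen only for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations for one intra-cell edge (scalar toy versions).
INTRA_OPS = {
    "tanh": math.tanh,
    "relu": lambda x: max(0.0, x),
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "identity": lambda x: x,
}

def mixed_op(x, alpha):
    """Intra-cell relaxation: softmax-weighted sum of all candidate operations."""
    weights = softmax(alpha)
    return sum(w * op(x) for w, op in zip(weights, INTRA_OPS.values()))

def inter_cell_input(prev_states, beta):
    """Inter-cell relaxation: softmax-weighted sum over states of earlier cells."""
    weights = softmax(beta)
    return sum(w * h for w, h in zip(weights, prev_states))

def joint_step(x, prev_states, alpha, beta):
    """One toy step that depends on both parameter sets, so a gradient-based
    optimizer could update alpha (intra-cell) and beta (inter-cell) together."""
    context = inter_cell_input(prev_states, beta)
    return mixed_op(x + context, alpha)

def discretize(logits, names):
    """After search, keep only the candidate with the largest weight."""
    best = max(range(len(logits)), key=logits.__getitem__)
    return names[best]
```

In this sketch, `alpha` weights the candidate operations inside a cell and `beta` weights the connections to earlier cells. Because `joint_step` is a smooth function of both, the two sets of architecture parameters can in principle be learned at the same time, and `discretize` then reads off a single choice for each.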
Jun-5-2020
- Genre:
- Research Report (0.64)
- Industry:
- Information Technology (0.46)