GroSS: Group-Size Series Decomposition for Whole Search-Space Training
Henry Howard-Jenkins, Yiwen Li, Victor A. Prisacariu
GroSS allows for dynamic and differentiable selection of factorisation rank, which is analogous to a grouped convolution. Therefore, to the best of our knowledge, GroSS is the first method to simultaneously train differing numbers of groups within a single layer, as well as all possible combinations between layers. In doing so, GroSS trains an entire grouped convolution architecture search-space concurrently. We demonstrate this through proof-of-concept architecture searches with performance objectives. GroSS represents a significant step towards liberating network architecture search from the burden of training and fine-tuning.

Deep networks have generally required careful design, often relying on domain knowledge to produce a structure that can encapsulate the task at hand. Neural Architecture Search (NAS) has provided an alternative to hand-designed networks, allowing for the search and even direct optimisation of the network's structure. However, the search space for architectures is often vast, with potentially limitless design choices. Furthermore, each configuration must undergo some training or fine-tuning before its efficacy can be determined.
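As a rough illustration of the two ideas in the abstract, the sketch below assumes (going by the title, not by any formula given here) that a layer's weight is written as a sum of terms with increasing group size, each term being block-diagonal. Truncating the sum then selects the effective group size, and one set of stored terms covers every per-layer choice. The names `block_diagonal_mask`, `series_weight`, and the constant-valued weights are hypothetical and chosen only to make the structure visible; this is not the authors' implementation.

```python
from itertools import product


def block_diagonal_mask(channels, group_size):
    """Return a channels x channels 0/1 mask: 1 where both channels share a group."""
    return [[1 if i // group_size == j // group_size else 0
             for j in range(channels)]
            for i in range(channels)]


def series_weight(terms, channels, upto_group_size):
    """Sum the block-diagonal terms whose group size is <= upto_group_size."""
    total = [[0.0] * channels for _ in range(channels)]
    for group_size, weight in terms.items():
        if group_size > upto_group_size:
            continue  # term excluded: it would connect channels across groups
        mask = block_diagonal_mask(channels, group_size)
        for i in range(channels):
            for j in range(channels):
                total[i][j] += mask[i][j] * weight[i][j]
    return total


channels = 4
group_sizes = [1, 2, 4]  # assumed candidate group sizes (channels per group)

# Hypothetical weights: the term for group size g is filled with the value g,
# so the contribution of each term is easy to read off in the result.
terms = {g: [[float(g)] * channels for _ in range(channels)] for g in group_sizes}

w2 = series_weight(terms, channels, upto_group_size=2)  # behaves like group size 2
w4 = series_weight(terms, channels, upto_group_size=4)  # behaves like a dense layer

# With one such series per layer, every combination of per-layer group sizes
# is available from the same stored terms.
num_layers = 3
configurations = list(product(group_sizes, repeat=num_layers))
print(len(configurations), "configurations for", num_layers, "layers")
```

Here `w2` is zero wherever two channels fall in different groups of size 2, while `w4` has no zero entries. The number of configurations grows as the number of candidate group sizes raised to the number of layers, which is why training each configuration separately becomes impractical.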
Dec-2-2019