Modular Networks: Learning to Decompose Neural Computation

Louis Kirsch, Julius Kunze, David Barber

Neural Information Processing Systems

Scaling model capacity has been vital to the success of deep learning.