Scaling up deep neural networks: a capacity allocation perspective