Breaking Neural Network Scaling Laws with Modularity

Open in new window