Training Neural Networks at Any Scale