Rethinking Conventional Wisdom in Machine Learning: From Generalization to Scaling