No Reason for No Supervision: Improved Generalization in Supervised Models