New insights into training dynamics of deep classifiers