Understanding Square Loss in Training Overparametrized Neural Network Classifiers