Uses and Abuses of the Cross-Entropy Loss: Case Studies in Modern Deep Learning