Why Do We Use Cross-Entropy in Deep Learning -- Part 2