A Closer Look at Classification Evaluation Metrics and a Critical Reflection of Common Evaluation Practice