There are a number of popular evaluation metrics for classification beyond accuracy, such as precision, recall, F-scores, and AUC. Instead of listing them all here, I think it is best to point you towards some interesting resources that can kick-start your search for answers. Although you might not be using scikit-learn, the metrics remain relevant. It also lists the differences between the binary and multi-class classification settings.
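To make a few of these metrics concrete, here is a minimal sketch in pure Python for the binary case, computing precision, recall, and F1 directly from their definitions. The label vectors are made up purely for illustration:

```python
# Toy binary labels: 1 = positive class, 0 = negative class.
# These vectors are invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# Count the relevant confusion-matrix cells.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)  # of everything predicted positive, how much was right
recall = tp / (tp + fn)     # of all actual positives, how many did we find
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(precision, recall, f1)
```

In scikit-learn the same numbers come from `sklearn.metrics.precision_score`, `recall_score`, and `f1_score`; the hand-rolled version above just makes the definitions explicit.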
The training regimen works like this: first, we feed in training data and have the model make predictions using its current parameter values. These predictions are compared with the correct categories, and the numerical result of this comparison is known as the loss. The smaller the loss value, the closer the predicted categories are to the correct ones, and vice versa. The aim is to minimize the loss. But before we look at loss minimization, let's take a look at how the loss is calculated.
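As one concrete example of such a comparison, here is a sketch of the cross-entropy loss commonly used for classification. The probability vectors are made up for illustration, and the exact loss used in practice depends on the model and framework; the point is only that a confident, correct prediction yields a small loss and a poor one yields a large loss:

```python
import math

def cross_entropy(predicted_probs, true_index):
    """Negative log of the probability the model assigned to the correct category."""
    return -math.log(predicted_probs[true_index])

# Suppose the correct category is index 0.
good_prediction = [0.9, 0.05, 0.05]  # confident and correct -> small loss
poor_prediction = [0.2, 0.5, 0.3]    # mostly wrong -> larger loss

print(cross_entropy(good_prediction, 0))  # ~0.105
print(cross_entropy(poor_prediction, 0))  # ~1.609
```

Minimizing this quantity over the training set pushes the model to assign higher probability to the correct categories.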
Computer vision will play a crucial role in visual search, self-driving cars, medicine, and many other applications. Success will hinge on collecting and labeling large datasets that can be used to train and test new algorithms. One area that has seen great advances over the last five years is image classification, i.e., automatically determining what objects are present in an image. Existing image classification datasets have a roughly equal number of images for each class. However, the real world is long-tailed: only a small percentage of classes are observed frequently; most classes are infrequent or rare.
Last week I published a blog post about how easy it is to train image classification models with Keras. What I did not show in that post was how to use the model for making predictions. This, I will do here. But predictions alone are boring, so I'm adding explanations for the predictions using the lime package. I have already written a few blog posts (here, here and here) about LIME and have given talks (here and here) about it, too.