Collaborating Authors

The 6 Metrics You Need to Optimize for Performance in Machine Learning


There are many metrics to measure the performance of your model depending on the type of machine learning you are looking to conduct. In this article, we take a look at performance measures for classification and regression models and discuss which is better optimized. Sometimes the metric to look at will vary according to the problem that is initially being solved. The True Positive Rate also called Recall is the go-to performance measure in binary/non-binary classification problems. Most if not all the time, we are only interested in correctly predicting one class.

Label Embedding Trees for Large Multi-Class Tasks

Neural Information Processing Systems

Multi-class classification becomes challenging at test time when the number of classes is very large and testing against every possible class can become computationally infeasible. This problem can be alleviated by imposing (or learning) a structure over the set of classes. We propose an algorithm for learning a tree-structure of classifiers which, by optimizing the overall tree loss, provides superior accuracy to existing tree labeling methods. We also propose a method that learns to embed labels in a low dimensional space that is faster than non-embedding approaches and has superior accuracy to existing embedding approaches. Finally we combine the two ideas resulting in the label embedding tree that outperforms alternative methods including One-vs-Rest while being orders of magnitude faster.

How to evaluate a machine learning model - part 3


This blog post is the continuation of my previous articles part 1 and part 2. The average per-class accuracy is a variation of accuracy. It is defined as the average of the accuracy for each individual class. Accuracy is an example of what is known as a micro-average, while average per-class accuracy is a macro-average. In general, when there are different numbers of examples per class, the average per-class accuracy will be different from the accuracy. Why this is important is because when the classes are imbalanced, i.e., there are a lot more examples of one class than of the other, and then the accuracy will give an imprecise picture as the class with more examples will dominate the statistic.

Why the high accuracy in classification is not always correct?


Classification accuracy is a statistic that describes a classification model's performance by dividing the number of correct predictions by the total number of predictions. It is simple to compute and comprehend, making it the most often used statistic for assessing classifier models. But not in every scenario accuracy score is to be considered the best metric to evaluate the model. In this article, we will discuss the reasons not to believe in the accuracy performance parameter completely. Following are the topics to be covered.

Incremental Learning for Metric-Based Meta-Learners Machine Learning

Majority of the modern meta-learning methods for few-shot classification tasks operate in two phases: a meta-training phase where the meta-learner learns a generic representation by solving multiple few-shot tasks sampled from a large dataset and a testing phase, where the meta-learner leverages its learnt internal representation for a specific few-shot task involving classes which were not seen during the meta-training phase. To the best of our knowledge, all such meta-learning methods use a single base dataset for meta-training to sample tasks from and do not adapt the algorithm after meta-training. This strategy may not scale to real-world use-cases where the meta-learner does not potentially have access to the full meta-training dataset from the very beginning and we need to update the meta-learner in an incremental fashion when additional training data becomes available. Through our experimental setup, we develop a notion of incremental learning during the meta-training phase of meta-learning and propose a method which can be used with multiple existing metric-based meta-learning algorithms. Experimental results on benchmark dataset show that our approach performs favorably at test time as compared to training a model with the full meta-training set and incurs negligible amount of catastrophic forgetting