Confusion Matrix


In some of my previous blogs I have discussed different machine learning algorithms that we can use to build models. We clean and pre-process the data, pass it into the model, and the model makes predictions. But how do we know whether the model is good or bad?
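One common answer to that question is the confusion matrix. The sketch below shows the idea for a binary classifier; the labels and predictions are invented for illustration and do not come from the article:

```python
# A minimal sketch of how a confusion matrix summarizes binary predictions.
# The labels and predictions below are made up for illustration.
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
print(tp, fp, fn, tn, accuracy)  # 3 1 1 3 0.75
```

The four counts separate the two ways a prediction can be right (true positives, true negatives) from the two ways it can be wrong (false positives, false negatives), which a single accuracy number hides.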

Intuitively understand ROC and implement it in R and Python


The field of machine learning can broadly be categorised into supervised learning, unsupervised learning, and reinforcement learning. Supervised learning uses previous examples with known outputs to determine an appropriate mathematical function for solving a classification or regression problem. This post focuses on the ROC (Receiver Operating Characteristic) curve, which is widely used in the machine learning community to assess the performance of a classification algorithm. It will help you intuitively understand what an ROC curve is and show you how to implement it in both R and Python. The article is divided into four parts, each dealing with one of the objectives stated above.
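The core of any ROC implementation is the same in R or Python: sweep a decision threshold over the classifier's scores and record the (false positive rate, true positive rate) pair at each threshold. A rough sketch, with hypothetical labels and scores:

```python
# Sketch: building ROC points by sweeping a threshold over predicted scores.
# The labels and scores below are hypothetical.
def roc_points(y_true, scores):
    P = sum(y_true)            # number of positives
    N = len(y_true) - P        # number of negatives
    points = []
    # Start above the highest score (nothing predicted positive),
    # then lower the threshold one distinct score at a time.
    for thr in [max(scores) + 1] + sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / N, tp / P))
    return points

print(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
# [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

Plotting these points (FPR on the x-axis, TPR on the y-axis) gives the ROC curve; library routines such as scikit-learn's `roc_curve` do the same sweep more efficiently.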

Analyzing the Performance of the Classification Models in Machine Learning


A confusion matrix (also called an error matrix) is used to analyze how well classification models (such as Logistic Regression or a Decision Tree Classifier) perform. Why do we analyze the performance of our models? Doing so helps us find and eliminate bias and variance problems if they exist, and it also helps us fine-tune the model so that it produces more accurate results. The confusion matrix is usually applied to binary classification problems but can be extended to multi-class classification as well. Concepts are comprehended better when illustrated with examples, so let us consider one.
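As a hypothetical worked example (the counts below are invented, not taken from the article), suppose a binary classifier evaluated on 100 examples produces the four confusion-matrix counts shown here; the standard metrics follow directly from them:

```python
# Hypothetical confusion-matrix counts for a binary classifier
# evaluated on 100 examples (numbers invented for illustration).
tp, fp, fn, tn = 40, 10, 5, 45

accuracy  = (tp + tn) / (tp + fp + fn + tn)   # fraction of correct predictions
precision = tp / (tp + fp)                    # of predicted positives, how many are real
recall    = tp / (tp + fn)                    # of real positives, how many were found
f1        = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f1)        # 0.85 0.8 0.888... 0.842...
```

Looking at precision and recall alongside accuracy is what reveals bias toward one class that accuracy alone would mask.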

ROC Curve and AUC -- Explained


The ROC (receiver operating characteristic) curve and AUC (area under the curve) are performance measures that provide a comprehensive evaluation of classification models. AUC turns the ROC curve into a single numeric summary of a binary classifier's performance: it is the area under the ROC curve and takes a value between 0 and 1, indicating how well the model separates the positive and negative classes. Before going into detail, let's first explain the confusion matrix and how different threshold values change its outcome. A confusion matrix is not a metric for evaluating a model, but it provides insight into its predictions.
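AUC's class-separation claim has a useful probabilistic reading: it equals the chance that a randomly chosen positive example is scored above a randomly chosen negative one. A sketch of that computation, with hypothetical labels and scores:

```python
# Sketch of AUC via its probabilistic interpretation: the probability that
# a random positive is scored above a random negative (ties count 1/2).
# The labels and scores below are hypothetical.
def auc(y_true, scores):
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 1 means perfect separation, 0.5 means the scores rank positives and negatives no better than chance, and values below 0.5 mean the ranking is systematically inverted.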

Model Evaluation Metrics in Machine Learning - KDnuggets


Predictive models have become a trusted advisor to many businesses, and for good reason. These models can "foresee the future", and with many different methods available, any industry can find one that fits its particular challenges. When we talk about predictive models, we mean either a regression model (continuous output) or a classification model (nominal or binary output). While preparing the data and training a machine learning model are key steps in the machine learning pipeline, it is equally important to measure the performance of the trained model. How well the model generalizes to unseen data is what distinguishes adaptive from non-adaptive machine learning models.
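The standard way to measure generalization is to score the model on data it never trained on. A minimal sketch with a holdout split; the toy data and the mean-predictor "model" are invented baselines, not anything from the article:

```python
# Sketch: measuring generalization with a simple holdout split.
# The toy data and the mean-predictor baseline are invented.
data = [(x, 2 * x + 1) for x in range(10)]   # toy (input, target) pairs
train, test = data[:7], data[7:]             # hold out the last 3 examples

# Baseline regression "model": always predict the training-set mean target.
mean_y = sum(y for _, y in train) / len(train)

# Mean squared error on the held-out data measures generalization.
mse = sum((y - mean_y) ** 2 for _, y in test) / len(test)
print(mean_y, mse)
```

Any real model should beat such a baseline on the held-out set; for classification, the same split is used with accuracy, precision/recall, or AUC in place of MSE.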