As machine learning (ML) becomes more effective and widespread it is becoming more prevalent in systems with real-life impact, from loan recommendations to job application decisions. With the growing usage comes the risk of bias – biased training data could lead to biased ML algorithms, which in turn could perpetuate discrimination and bias in society. In a new paper from Google, researchers propose a novel technique to train machine learning algorithms fairly even with a biased dataset. At the heart of the technique is the idea that a biased dataset can be perceived as an unbiased dataset which has gone through manipulation by a biased agent. Using this framework, the biased dataset is re-weighted to fit the (theoretical) unbiased dataset, and only then fed into a machine learning algorithm as training data.
Why is artificial intelligence (AI) and machine learning (ML) so important? Anyone who doesn't understand this will soon be left behind. There are many kinds of implementations and techniques that carry out AI and ML to solve real-time problems, and supervised learning is one of the most used approaches. "The key to artificial intelligence has always been the representation." In supervised learning, we start by importing a dataset containing training attributes and the target attributes.
The Learning Vector Quantization (LVQ) algorithm is a lot like k-Nearest Neighbors. Predictions are made by finding the best match among a library of patterns. The difference is that the library of patterns is learned from training data, rather than using the training patterns themselves. The library of patterns are called codebook vectors and each pattern is called a codebook. The codebook vectors are initialized to randomly selected values from the training dataset.
In this post, we are going to introduce you to the Support Vector Machine (SVM) machine learning algorithm. We will follow a similar process to our recent post Naive Bayes for Dummies; A Simple Explanation by keeping it short and not overly-technical. The aim is to give those of you who are new to machine learning a basic understanding of the key concepts of this algorithm. Support Vector Machines - What are they? A Support Vector Machine (SVM) is a supervised machine learning algorithm that can be employed for both classification and regression purposes. SVMs are more commonly used in classification problems and as such, this is what we will focus on in this post. SVMs are based on the idea of finding a hyperplane that best divides a dataset into two classes, as shown in the image below.
In this post you will discover the k-Nearest Neighbors (KNN) algorithm for classification and regression. After reading this post you will know. This post was written for developers and assumes no background in statistics or mathematics. The focus is on how the algorithm works and how to use it for predictive modeling problems. If you have any questions, leave a comment and I will do my best to answer.