 Regression


Locally Weighted Regression

@machinelearnbot

A couple of weeks back, I started a review of the linear models I've used over the years, and I realized that I never really understood how the locally weighted regression algorithm works. This, and the fact that sklearn had no support for it, encouraged me to investigate the working principles of the algorithm. In this post, I will attempt to provide an overview of the algorithm using mathematical inference and list some of the implementations available in Python. Regression is the estimation of a continuous response variable based on the values of some other variable. The variable to be estimated is dependent on the other variable(s) in the function space.
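As a rough illustration of the idea, here is a minimal NumPy sketch of locally weighted linear regression. It is not taken from the post: the function name `lwr_predict`, the Gaussian kernel, and the bandwidth parameter `tau` are assumptions chosen for this example. For each query point, a separate weighted least-squares line is fitted, with training points near the query weighted most heavily.

```python
import numpy as np

def lwr_predict(x_query, X, y, tau=0.5):
    """Predict y at x_query from a locally weighted linear fit.

    Hypothetical helper for illustration; tau is the assumed kernel bandwidth.
    """
    # Add an intercept column to the training inputs and to the query point.
    Xb = np.column_stack([np.ones(len(X)), X])
    xq = np.concatenate([[1.0], np.atleast_1d(x_query)])
    # Gaussian weights: training points close to the query get weight near 1.
    d2 = np.sum((Xb[:, 1:] - xq[1:]) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * tau ** 2))
    # Solve the weighted normal equations (X^T W X) theta = X^T W y.
    XtW = Xb.T * w
    theta = np.linalg.lstsq(XtW @ Xb, XtW @ y, rcond=None)[0]
    return float(xq @ theta)

# Usage: a noiseless sine curve, queried at a single point.
X_train = np.linspace(0.0, 2.0 * np.pi, 200)
y_train = np.sin(X_train)
pred = lwr_predict(np.pi / 2.0, X_train, y_train, tau=0.3)
```

A smaller `tau` makes the fit follow the data more closely, and a larger one pushes it toward an ordinary global linear fit. Because a new fit is solved per query, prediction cost grows with the size of the training set.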


Solving A Simple Classification Problem with Python -- Fruits Lovers' Edition

@machinelearnbot

In this post, we'll implement several machine learning algorithms in Python using Scikit-learn, the most popular machine learning tool for Python. We will use a simple dataset to train a classifier that distinguishes between different types of fruits. The purpose of this post is to identify the machine learning algorithm that is best suited to the problem at hand; thus, we want to compare different algorithms and select the best-performing one. The fruits dataset was created by Dr. Iain Murray from the University of Edinburgh. He bought a few dozen oranges, lemons and apples of different varieties, and recorded their measurements in a table.


Is Medicine Mesmerized by Machine Learning? Statistical Thinking

#artificialintelligence

BD Horne et al. wrote an important paper, "Exceptional mortality prediction by risk scores from common laboratory tests," that apparently garnered little attention, perhaps because it used older technology: standard clinical lab tests and logistic regression. Yet even putting themselves at a significant predictive disadvantage by binning all the continuous lab values into fifths, the authors were able to achieve a validated c-index (AUROC) of 0.87 in predicting death within 30d in a mixed inpatient, outpatient, and emergency department patient population. Their model also predicted 1y and 5y mortality very well, and performed well in a completely independent NHANES cohort. It also performed very well when evaluated just in outpatients, a group with very low mortality. The above model, called by the authors the Intermountain Risk Score, used the following predictors: age, sex, hematocrit, hemoglobin, red cell distribution width, mean corpuscular volume, red blood cell count, platelet count, mean platelet volume, mean corpuscular hemoglobin, mean corpuscular hemoglobin concentration, total white blood count, sodium, potassium, chloride, bicarbonate, calcium, glucose, creatinine, and BUN.


Create your Machine Learning library from scratch with R ! (1/3) - Enhance Data Science

#artificialintelligence

When dealing with Machine Learning problems in R, most of the time you rely on already existing libraries. This speeds up the analysis process, but do you really understand what is behind the algorithms? Could you implement a logistic regression from scratch with R? The goal of this post is to create our own basic machine learning library from scratch with R. We will only use the linear algebra tools available in R. The goal of linear regression is to estimate a continuous variable given a matrix of observations. Before dealing with the code, we need to derive the solution of the linear regression. Given a matrix of observations and the target .


Comparison between Ridge, linear and lasso regression

@machinelearnbot

Linear regression is one of the most widely known modeling techniques. It is usually among the first few topics that people pick up while learning predictive modeling. In this technique, the dependent variable is continuous, the independent variable(s) can be continuous or discrete, and the nature of the regression line is linear. Linear Regression establishes a relationship between the dependent variable (Y) and one or more independent variables (X) using a best-fit straight line (also known as the regression line). Ridge Regression is a technique used when the data suffers from multicollinearity (independent variables are highly correlated).
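To make the multicollinearity point concrete, here is a small NumPy sketch. It is not from the article: the helper `fit_linear`, the synthetic data, and the penalty value `alpha=1.0` are all assumptions chosen for illustration. It fits the same closed-form estimator with and without an L2 penalty on two nearly identical predictors.

```python
import numpy as np

def fit_linear(X, y, alpha=0.0):
    """Closed-form least squares; alpha > 0 adds a ridge (L2) penalty.

    Hypothetical helper for illustration, not a library API.
    """
    # Centre the data so that the intercept is not penalised.
    X_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - X_mean, y - y_mean
    # Ridge solves (X^T X + alpha * I) beta = X^T y; alpha = 0 gives plain OLS.
    A = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(A, Xc.T @ yc)
    intercept = y_mean - X_mean @ coef
    return coef, intercept

# Synthetic data (assumed): x2 is almost a copy of x1, so the two are collinear.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.01 * rng.normal(size=200)
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.1, size=200)

ols_coef, _ = fit_linear(X, y, alpha=0.0)
ridge_coef, _ = fit_linear(X, y, alpha=1.0)
print("OLS coefficients:  ", ols_coef)
print("Ridge coefficients:", ridge_coef)
```

With predictors this correlated, the unpenalised coefficients are poorly determined individually, while the penalty pulls the estimate toward a smaller-norm solution that shares the effect between the two columns. The value of `alpha` would normally be chosen by cross-validation rather than fixed by hand.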


Linearized Binary Regression

arXiv.org Machine Learning

Probit regression was first proposed by Bliss in 1934 to study mortality rates of insects. Since then, an extensive body of work has analyzed and used probit or related binary regression methods (such as logistic regression) in numerous applications and fields. This paper provides a fresh angle to such well-established binary regression methods. Concretely, we demonstrate that linearizing the probit model in combination with linear estimators performs on par with state-of-the-art nonlinear regression methods, such as posterior mean or maximum a posteriori estimation, for a broad range of real-world regression problems. We derive exact, closed-form, and nonasymptotic expressions for the mean-squared error of our linearized estimators, which clearly separates them from nonlinear regression methods that are typically difficult to analyze. We showcase the efficacy of our methods and results for a number of synthetic and real-world datasets, which demonstrates that linearized binary regression finds potential use in a variety of inference, estimation, signal processing, and machine learning applications that deal with binary-valued observations or measurements.


Multiple Linear regression - Transformation on Percentage regressors

@machinelearnbot

I'm not aware of a requirement to square a percentage variable. If I assume that the percentage is independent (i.e. it isn't a percentage of one of the other variables or of the dependent variable), then using a percent (0 to 100) is a simple linear transformation of the values of the original parameter; therefore, there is no reason to square it. However, consider that your goal is to best model the dependent variable. If you find that a transformation of one of the independent variables improves the results, then you should consider using it. Just be careful when using the model to ensure you properly transform the independent data.
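The claim that a 0 to 100 percentage is only a linear rescaling can be checked numerically. The sketch below uses synthetic data and a hypothetical helper `ols`, both assumptions made for this illustration. It fits ordinary least squares once with a proportion on a 0 to 1 scale and once with the same values expressed as percentages.

```python
import numpy as np

def ols(x, y):
    """Fit y = b0 + b1 * x by least squares; return coefficients and fitted values."""
    A = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, A @ beta

# Synthetic data (assumed): a proportion between 0 and 1 and a noisy linear response.
rng = np.random.default_rng(1)
frac = rng.uniform(0.0, 1.0, size=100)
y = 2.0 + 5.0 * frac + rng.normal(scale=0.2, size=100)

beta_frac, fitted_frac = ols(frac, y)
beta_pct, fitted_pct = ols(100.0 * frac, y)

# The fitted values agree; only the slope changes, by the factor of 100.
print(np.allclose(fitted_frac, fitted_pct))
print(beta_frac[1], 100.0 * beta_pct[1])
```

Rescaling changes the units of the slope but not the quality of the fit, which is why squaring (a nonlinear transformation) is a separate modelling decision that should be justified by the data.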


Removing eye activity from EEG signals via ICA

@machinelearnbot

In this previous post, I used linear regression to remove ocular artifacts from EEG signals. A popular alternative to this approach is independent component analysis (ICA). Components that represent ocular activity can be identified and eliminated to reconstruct artifact-free EEG signals. This approach is described in more detail in Jung et al. (2000). A comprehensive comparison between the two methods is beyond the scope of this post.


A Complete Tutorial on Ridge and Lasso Regression in Python

@machinelearnbot

When we talk about Regression, we often end up discussing Linear and Logistic Regression. Do you know there are 7 types of Regressions? Linear and logistic regression are just the most loved members of the family of regressions. Last week, I saw a recorded talk at NYC Data Science Academy from Owen Zhang, current Kaggle rank 3 and Chief Product Officer at DataRobot. He said, 'if you are using regression without regularization, you have to be very special!'. I hope you get what a person of his stature was referring to. I understood it very well and decided to explore regularization techniques in detail. In this article, I have explained the complex science behind 'Ridge Regression' and 'Lasso Regression', which are the most fundamental regularization techniques, sadly still not used by many.


A Gaussian Process Regression Model for Distribution Inputs

arXiv.org Machine Learning

Abstract: Monge-Kantorovich distances, otherwise known as Wasserstein distances, have received growing attention in statistics and machine learning as a powerful discrepancy measure for probability distributions. In this paper, we focus on forecasting a Gaussian process indexed by probability distributions. For this, we provide a family of positive definite kernels built using transportation-based distances. We provide a probabilistic understanding of these kernels and characterize the corresponding stochastic processes. We prove that the Gaussian processes indexed by distributions corresponding to these kernels can be efficiently forecast, opening new perspectives in Gaussian process modeling. Originally used in spatial statistics (see for instance [2] and references therein), Kriging has become very popular in many fields such as machine learning or computer experiments, as described in [3]. It consists of predicting the value of a function at some point by a linear combination of observed values at different points. The unknown function is modeled as the realization of a random process, usually Gaussian, and the Kriging forecast can be seen as the posterior mean, leading to the optimal linear unbiased predictor of the random process. Gaussian process models rely on the definition of a covariance function that characterizes the correlations between values of the process at different observation points. As the notion of similarity between data points is crucial, i.e. inputs at close locations are likely to have similar target values, covariance functions are the key ingredient in using Gaussian processes, since they define nearness or similarity.