AITopics | Regression

Regression

Transformation Forests

arXiv.org Machine Learning | Jan-8-2018

Regression models for supervised learning problems with a continuous target are commonly understood as models for the conditional mean of the target given predictors. This notion is simple and therefore appealing for interpretation and visualisation. Information about the whole underlying conditional distribution is, however, not available from these models. A more general understanding of regression models as models for conditional distributions allows much broader inference from such models, for example the computation of prediction intervals. Several random forest-type algorithms aim at estimating conditional distributions, most prominently quantile regression forests (Meinshausen, 2006, JMLR). We propose a novel approach based on a parametric family of distributions characterised by their transformation function. A dedicated novel "transformation tree" algorithm able to detect distributional changes is developed. Based on these transformation trees, we introduce "transformation forests" as an adaptive local likelihood estimator of conditional distribution functions. The resulting models are fully parametric yet very general and allow broad inference procedures, such as the model-based bootstrap, to be applied in a straightforward way.

artificial intelligence, machine learning, transformation tree, (19 more...)

arXiv.org Machine Learning

1701.0211

Country:

North America > United States > California (0.46)
Europe > Austria (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)

Intuition behind Bias-Variance trade-off, Lasso and Ridge Regression

@machinelearnbot | Jan-7-2018, 01:48:58 GMT

We can see that Ridge and Lasso perform far better than Linear Regression when the predictors in the dataset are correlated. We can tune the penalty parameter further to look for the lowest RMSE. Here, 0.01 is the best value I got for lambda. So, we can use these regression methods when the variables are highly correlated. Hope this article was useful in understanding the Bias-Variance trade-off, Lasso and Ridge Regression. Please feel free to comment, give feedback and share this article if you found it useful.
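The effect described in this excerpt can be illustrated with a minimal sketch, assuming two nearly collinear synthetic predictors and the closed-form ridge solution. The helper `fit_ridge_2d`, the data, and the penalty `lam=1.0` are assumptions made for illustration; they are not taken from the original article, whose dataset and tuned lambda of 0.01 are not reproduced here.

```python
import random


def fit_ridge_2d(x1, x2, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y for two predictors, no intercept.

    With lam = 0 this is ordinary least squares.
    """
    a = sum(u * u for u in x1) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + lam
    r1 = sum(u * t for u, t in zip(x1, y))
    r2 = sum(v * t for v, t in zip(x2, y))
    det = a * d - b * b
    # Inverse of the symmetric 2x2 matrix [[a, b], [b, d]] applied to (r1, r2).
    return ((d * r1 - b * r2) / det, (a * r2 - b * r1) / det)


# Hypothetical data: x2 is almost a copy of x1, so the predictors are
# highly correlated. The true coefficients are (1, 1).
random.seed(0)
n = 50
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [u + random.gauss(0, 0.01) for u in x1]
y = [u + v + random.gauss(0, 0.5) for u, v in zip(x1, x2)]

ols = fit_ridge_2d(x1, x2, y, lam=0.0)    # unpenalised fit
ridge = fit_ridge_2d(x1, x2, y, lam=1.0)  # illustrative penalty

print("OLS coefficients:  ", ols)
print("Ridge coefficients:", ridge)
```

The unpenalised coefficients can swing far from (1, 1) in opposite directions because the data barely distinguish the two predictors, while the ridge penalty shrinks the coefficient vector toward a more stable solution. In practice the penalty would be chosen by validation, as the excerpt suggests.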

artificial intelligence, machine learning, regression, (14 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Choosing the Correct Type of Regression Analysis

@machinelearnbot | Jan-6-2018, 21:58:35 GMT

Regression analysis mathematically describes the relationship between a set of independent variables and a dependent variable. There are numerous types of regression models that you can use. This choice often depends on the kind of data you have for the dependent variable and the type of model that provides the best fit. In this post, I cover the more common types of regression analyses and how to decide which one is right for your data. I'll provide an overview along with information to help you choose.

artificial intelligence, machine learning, regression, (17 more...)

@machinelearnbot

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

A Complete Guide to Build Better Predictive Models using Segmentation

@machinelearnbot | Jan-6-2018, 14:51:38 GMT

We use linear or logistic regression techniques to develop accurate models for predicting an outcome of interest. Often, we create separate models for separate segments. To judge their effectiveness, we even make use of segmentation methods such as CHAID or CRT.

artificial intelligence, machine learning, segmentation, (19 more...)

@machinelearnbot

Genre: Research Report > Experimental Study (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Modeling & Simulation (1.00)

Machine Learning Algorithms: Which One to Choose for Your Problem

#artificialintelligence | Jan-5-2018, 20:12:42 GMT

Supervised learning is the task of inferring a function from labeled training data. By fitting to the labeled training set, we want to find the optimal model parameters for predicting unknown labels on other objects (the test set). If the label is a real number, we call the task regression. If the label comes from a limited set of unordered values, the task is classification. In unsupervised learning we have less information about the objects; in particular, the training set is unlabeled.

algorithm, artificial intelligence, machine learning, (17 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.33)

You have created your first Linear Regression Model. Have you validated the assumptions?

@machinelearnbot | Jan-5-2018, 18:09:00 GMT

With the dawn of the age of Data Science, there is an increased interest in learning and applying algorithms, not just by business analysts or data scientists, but by several other professionals whose core job may not be crunching data or building models. A good sign, indeed, if one understands the when, why and how of applying these fantastic techniques. If your scatterplot shows a curvilinear relationship, keep in mind that higher-order polynomials (degree 2 or above) may do a better job of modelling the data. Compare models and statistics, and decide for yourself which model best explains your data. For a linear regression model to be valid, the VIF (Variance Inflation Factor) should not be too high. How high is too high?
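As a rough illustration of the VIF mentioned in this excerpt: the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The sketch below assumes the simplest case of exactly two predictors, where R_j^2 reduces to the squared Pearson correlation between them. The helper names and the sample values are hypothetical and do not come from the original article.

```python
from statistics import mean


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two of them."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors: the second nearly tracks the first.
x1 = [1, 2, 3, 4, 5, 6]
x2_correlated = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8]

# Hypothetical predictors with zero sample correlation.
u1 = [-1, 0, 1]
u2 = [1, -2, 1]

print("VIF, correlated pair:  ", vif_two_predictors(x1, x2_correlated))
print("VIF, uncorrelated pair:", vif_two_predictors(u1, u2))
```

A VIF of 1 means a predictor is uncorrelated with the others, and the value grows without bound as the predictors approach perfect collinearity. With more than two predictors, each R_j^2 would need a full auxiliary regression rather than a single correlation.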

artificial intelligence, assumption, machine learning, (8 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
Information Technology > Data Science (0.92)

Adversarial Perturbation Intensity Achieving Chosen Intra-Technique Transferability Level for Logistic Regression

Gubri, Martin

arXiv.org Machine Learning | Jan-5-2018

Machine learning models have been shown to be vulnerable to adversarial examples, i.e., the manipulation of data by an attacker to defeat a defender's classifier at test time. We present a novel probabilistic definition of adversarial examples in perfect or limited knowledge settings using prior probability distributions on the defender's classifier. Using the asymptotic properties of logistic regression, we derive a closed-form expression for the intensity of any adversarial perturbation needed to achieve a given expected misclassification rate. This technique is relevant in a threat model of known model specifications and unknown training data. To our knowledge, this is the first method that allows an attacker to directly choose the probability of attack success. We evaluate our approach on two real-world datasets.
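To make the idea of a perturbation intensity concrete, here is a minimal sketch of a much simpler setting than the one in the abstract: the attacker is assumed to know the logistic regression weights exactly, so the smallest L2 shift that moves an input to a chosen logit has a closed form. This is not the paper's probabilistic method, which places a prior on the defender's classifier. The function names, weights, and input below are hypothetical.

```python
import math


def logistic_prob(w, b, x):
    """Predicted probability of the positive class for input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def perturb_to_target_logit(w, b, x, target_logit):
    """Smallest L2 perturbation of x that moves its logit to target_logit.

    The shift is along the weight vector w; its length (the intensity) is
    |target_logit - z| / ||w||, where z is the current logit of x.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_sq = sum(wi * wi for wi in w)
    step = (target_logit - z) / norm_sq
    x_adv = [xi + step * wi for xi, wi in zip(x, w)]
    intensity = abs(target_logit - z) / math.sqrt(norm_sq)
    return x_adv, intensity


# Hypothetical known classifier and a confidently positive input.
w, b = [3.0, 4.0], -1.0
x = [1.0, 1.0]

# Push the input across the decision boundary to a logit of -1.
x_adv, intensity = perturb_to_target_logit(w, b, x, target_logit=-1.0)

print("original probability: ", logistic_prob(w, b, x))
print("perturbed probability:", logistic_prob(w, b, x_adv))
print("L2 intensity:         ", intensity)
```

When the classifier is only known up to a distribution, as in the limited-knowledge setting the abstract describes, a fixed intensity succeeds only with some probability, which is the quantity the paper aims to control.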

adversarial example, artificial intelligence, machine learning, (12 more...)

arXiv.org Machine Learning

1801.01953

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (0.54)
Research Report > Experimental Study (0.40)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Program Evaluation and Causal Inference with High-Dimensional Data

Belloni, Alexandre, Chernozhukov, Victor, Fernández-Val, Ivan, Hansen, Christian

arXiv.org Machine Learning | Jan-5-2018

In this paper, we provide efficient estimators and honest confidence bands for a variety of treatment effects including local average (LATE) and local quantile treatment effects (LQTE) in data-rich environments. We can handle very many control variables, endogenous receipt of treatment, heterogeneous treatment effects, and function-valued outcomes. Our framework covers the special case of exogenous receipt of treatment, either conditional on controls or unconditionally as in randomized control trials. In the latter case, our approach produces efficient estimators and honest bands for (functional) average treatment effects (ATE) and quantile treatment effects (QTE). To make informative inference possible, we assume that key reduced form predictive relationships are approximately sparse. This assumption allows the use of regularization and selection methods to estimate those relations, and we provide methods for post-regularization and post-selection inference that are uniformly valid (honest) across a wide range of models. We show that a key ingredient enabling honest inference is the use of orthogonal or doubly robust moment conditions in estimating certain reduced form functional parameters. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) eligibility and participation on accumulated assets.

artificial intelligence, machine learning, selection selection dollar, (19 more...)

arXiv.org Machine Learning

1311.2645

Country: North America > United States > Massachusetts (0.27)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry: Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Python Overtaking R?

@machinelearnbot | Jan-4-2018, 21:15:50 GMT

I just read two articles that claim that Python is overtaking R for data science and machine learning. From user comments, I learned that R is still strong in certain tasks. I will survey what these tasks are. The first article by Vincent Granville from DSC uses proxy metrics (as opposed to asking the users). He uses statistics from Google Trends, Indeed job search terms, and Analytic Talent (DSC job database) to conclude that Python has overtaken R. One is led to ask if one group of users (say Python's) is a more active googler.

data science community, kdnugget, python, (9 more...)

@machinelearnbot

Country: North America > United States (0.05)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Installation Quickstart for Azure Machine Learning services

#artificialintelligence | Jan-3-2018, 17:33:36 GMT

Azure Machine Learning services (preview) is an integrated, end-to-end data science and advanced analytics solution. It helps professional data scientists to prepare data, develop experiments, and deploy models at cloud scale. This Quickstart shows you how to create experimentation and model management accounts in Azure Machine Learning Preview. It also shows you how to install the Azure Machine Learning Workbench desktop application and CLI tools. Next, you take a quick tour of Azure Machine Learning Preview features by using the Iris flower dataset to build a model that predicts the type of iris based on some of its physical characteristics.

artificial intelligence, machine learning, machine learning workbench, (13 more...)

#artificialintelligence

Genre: Instructional Material (0.35)

Industry: Education (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
