Goto

Collaborating Authors

 Regression


Regression Analysis for Statistics & Machine Learning in R

@machinelearnbot

It is a practical, hands-on course, i.e. we will spend some time dealing with some of the theoretical concepts related to both statistical and machine learning regression analysis. However, majority of the course will focus on implementing different techniques on real data and interpret the results. After each video you will learn a new concept or technique which you may apply to your own projects.


Gap Safe screening rules for sparsity enforcing penalties

arXiv.org Machine Learning

In high dimensional regression settings, sparsity enforcing penalties have proved useful to regularize the data-fitting term. A recently introduced technique called screening rules propose to ignore some variables in the optimization leveraging the expected sparsity of the solutions and consequently leading to faster solvers. When the procedure is guaranteed not to discard variables wrongly the rules are said to be safe. In this work, we propose a unifying framework for generalized linear models regularized with standard sparsity enforcing penalties such as $\ell_1$ or $\ell_1/\ell_2$ norms. Our technique allows to discard safely more variables than previously considered safe rules, particularly for low regularization parameters. Our proposed Gap Safe rules (so called because they rely on duality gap computation) can cope with any iterative solver but are particularly well suited to (block) coordinate descent methods. Applied to many standard learning tasks, Lasso, Sparse-Group Lasso, multi-task Lasso, binary and multinomial logistic regression, etc., we report significant speed-ups compared to previously proposed safe rules on all tested data sets.


Simple Linear Regression Basics using R Udemy

@machinelearnbot

This is an introductory course to Simple Linear Regression, one of the most basic courses in Statistics. This course is most suitable for students of professionals with basic knowledge or beginning level in Statistics and R coding. It also fits for anybody who want to explore the field of statistics using R in any discipline. We will start with an introduction section where I will explain the regression equation with a detailed example. Next, we will cover the essentials of modeling.


Applied Machine Learning in R Udemy

@machinelearnbot

They are powerful data mining techniques that allow you to detect patterns in your data or variables. For each technique, a number of practical exercises are proposed. By doing these exercises you'll actually apply in practice what you have learned. This course is your opportunity to become a machine learning expert in a few weeks only! With my video lectures, you will find it very easy to master the major machine learning techniques. Everything is shown live, step by step, so you can replicate any procedure at any time you need it. So click the "Enroll" button to get instant access to your machine learning course. It will surely provide you with new priceless skills. And, who knows, it could give you a tremendous career boost in the near future.


A Random Block-Coordinate Douglas-Rachford Splitting Method with Low Computational Complexity for Binary Logistic Regression

arXiv.org Artificial Intelligence

In this paper, we propose a new optimization algorithm for sparse logistic regression based on a stochastic version of the Douglas-Rachford splitting method. Our algorithm sweeps the training set by randomly selecting a mini-batch of data at each iteration, and it allows us to update the variables in a block coordinate manner. Our approach leverages the proximity operator of the logistic loss, which is expressed with the generalized Lambert W function. Experiments carried out on standard datasets demonstrate the efficiency of our approach w.r.t. stochastic gradient-like methods.


How to choose machine learning algorithms

#artificialintelligence

The answer to the question "What machine learning algorithm should I use?" is always "It depends." It depends on the size, quality, and nature of the data. It depends on what you want to do with the answer. It depends on how the math of the algorithm was translated into instructions for the computer you are using. And it depends on how much time you have. Even the most experienced data scientists can't tell which algorithm will perform best before trying them.


Optimal Weighting for Exam Composition

arXiv.org Artificial Intelligence

A problem faced by many instructors is that of designing exams that accurately assess the abilities of the students. Typically these exams are prepared several days in advance, and generic question scores are used based on rough approximation of the question difficulty and length. For example, for a recent class taught by the author, there were 30 multiple choice questions worth 3 points, 15 true/false with explanation questions worth 4 points, and 5 analytical exercises worth 10 points. We describe a novel framework where algorithms from machine learning are used to modify the exam question weights in order to optimize the exam scores, using the overall class grade as a proxy for a student's true ability. We show that significant error reduction can be obtained by our approach over standard weighting schemes, and we make several new observations regarding the properties of the "good" and "bad" exam questions that can have impact on the design of improved future evaluation methods.


Kernel Regression with Sparse Metric Learning

arXiv.org Machine Learning

Kernel regression is a popular non-parametric fitting technique. It aims at learning a function which estimates the targets for test inputs as precise as possible. Generally, the function value for a test input is estimated by a weighted average of the surrounding training examples. The weights are typically computed by a distance-based kernel function and they strongly depend on the distances between examples. In this paper, we first review the latest developments of sparse metric learning and kernel regression. Then a novel kernel regression method involving sparse metric learning, which is called kernel regression with sparse metric learning (KR$\_$SML), is proposed. The sparse kernel regression model is established by enforcing a mixed $(2,1)$-norm regularization over the metric matrix. It learns a Mahalanobis distance metric by a gradient descent procedure, which can simultaneously conduct dimensionality reduction and lead to good prediction results. Our work is the first to combine kernel regression with sparse metric learning. To verify the effectiveness of the proposed method, it is evaluated on 19 data sets for regression. Furthermore, the new method is also applied to solving practical problems of forecasting short-term traffic flows. In the end, we compare the proposed method with other three related kernel regression methods on all test data sets under two criterions. Experimental results show that the proposed method is much more competitive.


Network-Scale Traffic Modeling and Forecasting with Graphical Lasso and Neural Networks

arXiv.org Machine Learning

Department of Computer Science and Technology, East China Normal University, 500 Dongchuan Road, Shanghai 200241, China Abstract Traffic flow forecasting, especially the short-term case, is an important topic in intelligent transportation systems (ITS). This paper does a lot of research on networkscale modeling and forecasting of short-term traffic flows. Firstly, we propose the concepts of single-link and multi-link models of traffic flow forecasting. Secondly, we construct four prediction models by combining the two models with singletask learning and multi-task learning. The combination of the multi-link model and multi-task learning not only improves the experimental efficiency but also the prediction accuracy. Moreover, a new multi-link single-task approach that combines graphical lasso (GL) with neural network (NN) is proposed. GL provides a general methodology for solving problems involving lots of variables. Using L1 regularization, GL builds a sparse graphical model making use of the sparse inverse covariance matrix. In addition, Gaussian process regression (GPR) is a classic regression algorithm in Bayesian machine learning. Although there is wide research on GPR, there are few applications of GPR in traffic flow forecasting. In this paper, we apply GPR to traffic flow forecasting and show its potential. Through sufficient experiments, we compare all of the proposed approaches and make an overall assessment at last. Introduction With the accelerated pace of modern life, more and more cars come into use.


Multiple Regression Analysis with R Udemy

@machinelearnbot

It explores main concepts from basic to expert level which can help you achieve better grades, develop your academic career, apply your knowledge at work or make business forecasting related decisions. Learning multiple regression analysis is indispensable for business analysis, financial analysis or data science applications in areas such as consumer analytics, finance, banking, health care, science, e-commerce and social media. It is also essential for academic careers in data science, applied statistics, economics, econometrics or quantitative finance. And it is necessary for any business forecasting related decision. But as learning curve can become steep as complexity grows, this course helps by leading you through step by step real world practical examples for greater effectiveness.