Regression
Advantages and Disadvantages of Logistic Regression
Logistic Regression is one of the supervised Machine Learning algorithms used for classification i.e. to predict discrete valued outcome. It is a statistical approach that is used to predict the outcome of a dependent variable based on observations given in the training set. Logistic Regression is one of the simplest machine learning algorithms and is easy to implement yet provides great training efficiency in some cases. Also due to these reasons, training a model with this algorithm doesn't require high computation power. The predicted parameters (trained weights) give inference about the importance of each feature.
Reducibility and Statistical-Computational Gaps from Secret Leakage
Brennan, Matthew, Bresler, Guy
Inference problems with conjectured statistical-computational gaps are ubiquitous throughout modern statistics, computer science and statistical physics. While there has been success evidencing these gaps from the failure of restricted classes of algorithms, progress towards a more traditional reduction-based approach to computational complexity in statistical inference has been limited. Existing reductions have largely been limited to inference problems with similar structure -- primarily mapping among problems representable as a sparse submatrix signal plus a noise matrix, which are similar to the common hardness assumption of planted clique. The insight in this work is that a slight generalization of the planted clique conjecture -- secret leakage planted clique -- gives rise to a variety of new average-case reduction techniques, yielding a web of reductions among problems with very different structure. Using variants of the planted clique conjecture for specific forms of secret leakage planted clique, we deduce tight statistical-computational tradeoffs for a diverse range of problems including robust sparse mean estimation, mixtures of sparse linear regressions, robust sparse linear regression, tensor PCA, variants of dense $k$-block stochastic block models, negatively correlated sparse PCA, semirandom planted dense subgraph, detection in hidden partition models and a universality principle for learning sparse mixtures. In particular, a $k$-partite hypergraph variant of the planted clique conjecture is sufficient to establish all of our computational lower bounds. Our techniques also reveal novel connections to combinatorial designs and to random matrix theory. This work gives the first evidence that an expanded set of hardness assumptions, such as for secret leakage planted clique, may be a key first step towards a more complete theory of reductions among statistical problems.
How Much Math do you need in Data Science? - KDnuggets
Can I become a data scientist with little or no math background? What essential math skills are important in data science? There are so many good packages that can be used for building predictive models or for producing data visualizations. Thanks to these packages, anyone can build a model or produce a data visualization. However, very solid background knowledge in mathematics is essential for fine-tuning your models to produce reliable models with optimal performance.
Machine Learning Regression Masterclass in Python
Artificial Intelligence (AI) revolution is here! The technology is progressing at a massive scale and is being widely adopted in the Healthcare, defense, banking, gaming, transportation and robotics industries. Machine Learning is a subfield of Artificial Intelligence that enables machines to improve at a given task with experience. Machine Learning is an extremely hot topic; the demand for experienced machine learning engineers and data scientists has been steadily growing in the past 5 years. According to a report released by Research and Markets, the global AI and machine learning technology sectors are expected to grow from $1.4B to $8.8B by 2022 and it is predicted that AI tech sector will create around 2.3 million jobs by 2020.
The flare Package for High Dimensional Linear Regression and Precision Matrix Estimation in R
Li, Xingguo, Zhao, Tuo, Yuan, Xiaoming, Liu, Han
This paper describes an R package named flare, which implements a family of new high dimensional regression methods (LAD Lasso, SQRT Lasso, $\ell_q$ Lasso, and Dantzig selector) and their extensions to sparse precision matrix estimation (TIGER and CLIME). These methods exploit different nonsmooth loss functions to gain modeling flexibility, estimation robustness, and tuning insensitiveness. The developed solver is based on the alternating direction method of multipliers (ADMM). The package flare is coded in double precision C, and called from R by a user-friendly interface. The memory usage is optimized by using the sparse matrix output. The experiments show that flare is efficient and can scale up to large problems.
Normal Equation in Linear Regression
Gradient descent is a very popular and first-order iterative optimization algorithm for finding a local minimum over a differential function. Similarly, the Normal Equation is another way of doing minimization. It does minimization without restoring to an iterative algorithm. Here, the relationship between the Number of Rooms, and the Price of the House, appears to be Linear. Here, the predictions from the Normal Equation and Linear Equation are the same.
The STATA OMNIBUS: Regression and Modelling with STATA
The STATA OMNIBUS: Regression and Modelling with STATA 4.5 (5 ratings) Course Ratings are calculated from individual students' ratings and a variety of other signals, like age of rating and reliability, to ensure that they reflect course quality fairly and accurately. Learn everything you need to know about linear regression, non-linear regression, regression modelling and STATA in one package. Learning and applying new statistical techniques can often be a daunting experience. "Easy Statistics" is designed to provide you with a compact, and easy to understand, course that focuses on the basic principles of statistical methodology. This course will focus on the concept of linear regression and non-linear regression.
Stochastic Online Optimization using Kalman Recursion
de Vilmarest, Joseph, Wintenberger, Olivier
We study the Extended Kalman Filter in constant dynamics, offering a bayesian perspective of stochastic optimization. We obtain high probability bounds on the cumulative excess risk in an unconstrained setting. In order to avoid any projection step we propose a two-phase analysis. First, for linear and logistic regressions, we prove that the algorithm enters a local phase where the estimate stays in a small region around the optimum. We provide explicit bounds with high probability on this convergence time. Second, for generalized linear regressions, we provide a martingale analysis of the excess risk in the local phase, improving existing ones in bounded stochastic optimization. The EKF appears as a parameter-free online algorithm with O(d^2) cost per iteration that optimally solves some unconstrained optimization problems.
Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python
Ge, Jason, Li, Xingguo, Jiang, Haoming, Liu, Han, Zhang, Tong, Wang, Mengdi, Zhao, Tuo
We describe a new library named picasso, which implements a unified framework of pathwise coordinate optimization for a variety of sparse learning problems (e.g., sparse linear regression, sparse logistic regression, sparse Poisson regression and scaled sparse linear regression) combined with efficient active set selection strategies. Besides, the library allows users to choose different sparsity-inducing regularizers, including the convex $\ell_1$, nonconvex MCP and SCAD regularizers. The library is coded in C++ and has user-friendly R and Python wrappers. Numerical experiments demonstrate that picasso can scale up to large problems efficiently.
Nearest Neighbour Based Estimates of Gradients: Sharp Nonasymptotic Bounds and Applications
Ausset, Guillaume, Clémençon, Stephan, Portier, François
Motivated by a wide variety of applications, ranging from stochastic optimization to dimension reduction through variable selection, the problem of estimating gradients accurately is of crucial importance in statistics and learning theory. We consider here the classic regression setup, where a real valued square integrable r.v. Y is to be predicted upon observing a (possibly high dimensional) random vector X by means of a predictive function f(X) as accurately as possible in the mean-squared sense and study a nearest-neighbour-based pointwise estimate of the gradient of the optimal predictive function, the regression function m(x) E[Y X x]. Under classic smoothness conditions combined with the assumption that the tails of Y m(X) are sub-Gaussian, we prove nonasymptotic bounds improving upon those obtained for alternative estimation methods. Beyond the novel theoretical results established, several illustrative numerical experiments have been carried out. The latter provide strong empirical evidence that the estimation method proposed works very well for various statistical problems involving gradient estimation, namely dimensionality reduction, stochastic gradient descent optimization and quantifying disentanglement.