 Regression


Prediction in latent factor regression: Adaptive PCR and beyond

arXiv.org Machine Learning

This work is devoted to the finite sample prediction risk analysis of a class of linear predictors of a response $Y\in \mathbb{R}$ from a high-dimensional random vector $X\in \mathbb{R}^p$ when $(X,Y)$ follows a latent factor regression model generated by an unobservable latent vector $Z$ of dimension less than $p$. Our primary contribution is in establishing finite sample risk bounds for prediction with the ubiquitous Principal Component Regression (PCR) method, under the factor regression model, with the number of principal components adaptively selected from the data---a form of theoretical guarantee that is surprisingly lacking from the PCR literature. To accomplish this, we prove a master theorem that establishes a risk bound for a large class of predictors, including the PCR predictor as a special case. This approach has the benefit of providing a unified framework for the analysis of a wide range of linear prediction methods, under the factor regression setting. In particular, we use our main theorem to recover known risk bounds for the minimum-norm interpolating predictor, which has received renewed attention in the past two years, and a prediction method tailored to a subclass of factor regression models with identifiable parameters. This model-tailored method can be interpreted as prediction via clusters with latent centers. To address the problem of selecting among a set of candidate predictors, we analyze a simple model selection procedure based on data-splitting, providing an oracle inequality under the factor model to prove that the performance of the selected predictor is close to the optimal candidate. We conclude with a detailed simulation study to support and complement our theoretical results.
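As a rough illustration of the ideas in this abstract, the following sketch fits PCR on simulated factor-model data and picks the number of components by a simple train/validation split. It is not the paper's adaptive selection rule or its exact data-splitting procedure; the function names (`pcr_fit`, `pcr_predict`, `select_k_by_split`), the simulated dimensions, and the noise levels are all assumptions made for the example.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Fit principal component regression on the top-k principal components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centered design matrix.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                       # p x k matrix of leading directions
    scores = Xc @ V_k                    # n x k matrix of component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = V_k @ gamma                   # coefficients mapped back to R^p
    return x_mean, y_mean, beta

def pcr_predict(model, X):
    """Predict responses for new rows of X from a fitted PCR model."""
    x_mean, y_mean, beta = model
    return (X - x_mean) @ beta + y_mean

def select_k_by_split(X, y, candidates, train_frac=0.5, seed=0):
    """Choose k by fitting on one half and scoring squared error on the other."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_train = int(train_frac * len(y))
    tr, va = idx[:n_train], idx[n_train:]
    errors = {}
    for k in candidates:
        model = pcr_fit(X[tr], y[tr], k)
        errors[k] = float(np.mean((pcr_predict(model, X[va]) - y[va]) ** 2))
    best_k = min(errors, key=errors.get)
    return best_k, errors

# Simulated latent factor regression data (all sizes are illustrative).
rng = np.random.default_rng(0)
n, p, K = 200, 50, 3
Z = rng.normal(size=(n, K))              # unobserved latent factors
A = rng.normal(size=(K, p))              # loading matrix
X = Z @ A + 0.1 * rng.normal(size=(n, p))
y = Z @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

best_k, errors = select_k_by_split(X, y, candidates=range(1, 11))
print("selected k:", best_k)
```

In this simulation the validation error should drop sharply once k reaches the latent dimension, which is the behavior a data-driven choice of k is meant to exploit.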


Fairwashing Explanations with Off-Manifold Detergent

arXiv.org Machine Learning

Explanation methods promise to make black-box classifiers more transparent. As a result, it is hoped that they can act as proof for a sensible, fair and trustworthy decision-making process of the algorithm and thereby increase its acceptance by the end-users. In this paper, we show both theoretically and experimentally that these hopes are presently unfounded. Specifically, we show that, for any classifier $g$, one can always construct another classifier $\tilde{g}$ which has the same behavior on the data (same train, validation, and test error) but has arbitrarily manipulated explanation maps. We derive this statement theoretically using differential geometry and demonstrate it experimentally for various explanation methods, architectures, and datasets. Motivated by our theoretical insights, we then propose a modification of existing explanation methods which makes them significantly more robust.


Robust Causal Inference Under Covariate Shift via Worst-Case Subpopulation Treatment Effects

arXiv.org Machine Learning

We propose the worst-case treatment effect (WTE) across all subpopulations of a given size, a conservative notion of topline treatment effect. Compared to the average treatment effect (ATE), whose validity relies on the covariate distribution of collected data, WTE is robust to unanticipated covariate shifts, and positive findings guarantee uniformly valid treatment effects over subpopulations. We develop a semiparametrically efficient estimator for the WTE, leveraging machine learning-based estimates of the heterogeneous treatment effect and propensity score. By virtue of satisfying a key (Neyman) orthogonality property, our estimator enjoys central limit behavior---oracle rates with true nuisance parameters---even when estimates of nuisance parameters converge at slower rates. For both randomized trials and observational studies, we establish a semiparametric efficiency bound, proving that our estimator achieves the optimal asymptotic variance. On real datasets where robustness to covariate shift is of core concern, we illustrate the non-robustness of ATE under even mild distributional shift, and demonstrate that the WTE guards against brittle findings that are invalidated by unanticipated covariate shifts.


COVID-19 Outbreak Prediction using Machine Learning Algorithm

#artificialintelligence

Society is making extraordinary efforts to contain the spread of this life-threatening disease, which has strained infrastructure, finance, business, manufacturing, and many other resources. Artificial Intelligence (AI) researchers are applying their expertise to develop mathematical models for investigating the pandemic using nationally distributed data. This article applies machine learning models to forecast the expected spread of COVID-19 across nations, using real-time data from the Johns Hopkins dashboard. The spread of the coronavirus is categorized into four stages. The first stage consists of cases recorded among people who traveled to or from affected countries or cities, whereas in the second stage, cases are reported regionally among family, friends, and groups who came into contact with a person arriving from the affected countries.



Deep Learning Foundation : Linear Regression and Statistics

#artificialintelligence

Deep Learning Foundation: Linear Regression and Statistics (Udemy). Is statistics the foundation on top of which machine learning is built? Is machine ... "Traditional" linear regression may be considered by some Machine Learning ... What you'll learn: linear regression statistics basics, assumptions of linear regression, hypothesis testing, and sampling; how to program your own version of a linear regression model in Python; how to derive and solve a linear regression model and apply it appropriately to data science problems. Requirements: Jupyter notebook and simple Python programming. Description: Hi everyone, welcome to a new course created to sharpen your linear regression and statistical basics. In this course I explain hypothesis testing, unbiased estimators, statistical tests, and gradient descent. By the end of the course you will be able to code your own regression algorithm from scratch. Who this course is for: Python developers curious about data science, and data science and machine learning engineers.


Assumptions in Linear Regression you might not know.

#artificialintelligence

Linear regression is a linear approach to modelling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables). The predictor variables are treated as fixed values and may be transformed by any function, such as a polynomial or trigonometric one, but the model must remain strictly linear in its coefficients. This assumption underlies polynomial regression, which uses linear regression to fit the response variable as an arbitrary polynomial function of a predictor variable while staying linear in the coefficients.
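To make the point about linearity in the coefficients concrete, here is a minimal sketch that fits a quadratic curve with ordinary least squares. The data, the true coefficients, and the noise level are invented for the illustration; the only modelling step is building a design matrix whose columns are 1, x, and x^2.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 50)

# Illustrative ground truth: y = 1 - 3x + 0.5x^2 plus small Gaussian noise.
true_coefs = np.array([1.0, -3.0, 0.5])
y = (true_coefs[0] + true_coefs[1] * x + true_coefs[2] * x**2
     + 0.05 * rng.normal(size=x.size))

# The features are nonlinear in x, but the model is linear in the coefficients,
# so ordinary least squares applies unchanged.
design = np.vander(x, N=3, increasing=True)   # columns: 1, x, x^2
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)

print("estimated coefficients:", coefs)
```

The same pattern works for trigonometric or other fixed transformations: only the columns of the design matrix change, not the fitting procedure.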


Every Machine Learning Algorithm Can Be Represented as a Neural Network

#artificialintelligence

It seems that all of the work in machine learning -- starting from early research in the 1950s -- culminated in the creation of the neural network. One new algorithm after another was proposed, from logistic regression to support vector machines, but the neural network is, very literally, the algorithm of algorithms and the pinnacle of machine learning. It is a universal generalization of what machine learning is, rather than one attempt at doing it. In this sense, it is more of a framework and a concept than simply an algorithm, and this is evident given the massive amount of freedom in constructing neural networks -- hidden layer and node counts, activation functions, optimizers, loss functions, network types (convolutional, recurrent, etc.), and specialized layers (batch norm, dropout, etc.), to name a few. From this perspective of neural networks being a concept rather than a rigid algorithm comes a very interesting corollary: any machine learning algorithm, be it decision trees or k-nearest neighbors, can be represented using a neural network.
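The simplest case of this correspondence is logistic regression, which can be written as a network with no hidden layer: a single neuron with a sigmoid activation trained on cross-entropy loss. The sketch below is an illustration of that one case only, not a construction taken from the article; the class name `SingleNeuron`, the learning rate, and the toy data are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SingleNeuron:
    """One sigmoid unit: the same model as logistic regression."""

    def __init__(self, n_features):
        self.w = np.zeros(n_features)
        self.b = 0.0

    def forward(self, X):
        # Affine map followed by a sigmoid activation.
        return sigmoid(X @ self.w + self.b)

    def train(self, X, y, lr=0.5, epochs=500):
        for _ in range(epochs):
            p = self.forward(X)
            # Gradient of the mean cross-entropy loss w.r.t. the pre-activation.
            grad_z = (p - y) / len(y)
            self.w -= lr * (X.T @ grad_z)
            self.b -= lr * grad_z.sum()
        return self

# Toy, linearly separable data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

net = SingleNeuron(2).train(X, y)
accuracy = float(np.mean((net.forward(X) > 0.5) == (y == 1.0)))
print("training accuracy:", accuracy)
```

Representing tree-based or nearest-neighbor methods this way takes more elaborate architectures, which this sketch does not attempt.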


A Distributionally Robust Approach to Fair Classification

arXiv.org Machine Learning

We propose a distributionally robust logistic regression model with an unfairness penalty that prevents discrimination with respect to sensitive attributes such as gender or ethnicity. This model is equivalent to a tractable convex optimization problem if a Wasserstein ball centered at the empirical distribution on the training data is used to model distributional uncertainty and if a new convex unfairness measure is used to incentivize equalized opportunities. We demonstrate that the resulting classifier improves fairness at a marginal loss of predictive accuracy on both synthetic and real datasets. We also derive linear programming-based confidence bounds on the level of unfairness of any pre-trained classifier by leveraging techniques from optimal uncertainty quantification over Wasserstein balls.


Fundamentals of Machine Learning [Hindi][Python]

#artificialintelligence

Fundamentals of Machine Learning [Hindi][Python] (Udemy): a complete hands-on Machine Learning course with Data Science, NLP, Deep Learning and Artificial Intelligence. Created by Rishi Bansal. Language: English. Description: This course is designed to teach the basic concepts of Machine Learning. Anyone can opt for this course. No prior understanding of Machine Learning is required. NOTE: The course is still under development, and new topics will be added regularly. Now the question is: why this course?