Regression
Alexander Jung
This lecture discusses how decision trees can be used to represent predictor functions. Variations of the basic decision tree model provide some of the most powerful machine learning methods curren...
Classification Methods (46 minutes). Our focus is on linear regression methods which can be expanded by feature constructions.
Guest lecture of Prof. Minna Huotilainen on learning processes in human brains.
This video explains how network Lasso can be used to learn localized linear models that allow "personalized" predictions for individual data points within a network.
SAS Tutorial What is logistic regression?
In this SAS How To Tutorial, Christa Cody provides an introduction to logistic regression and looks at how to perform logistic regression in SAS. After a brief introduction, she shows how to apply some basic procedures to your data and fit the model in SAS Studio. Finally, Christa demonstrates how to do similar tasks using SAS Model Studio.
Download Data Files: download the HMEQ data set that Christa uses at http://support.sas.com/documentation/...
Content Outline
00:23 - Intro to Logistic Regression
04:52 - Fit the model in SAS Studio
11:31 - Show similar tasks in SAS Model Studio
12:41 - Why use logistic regression?
The LOGISTIC Procedure - http://support.sas.com/documentation/...
Beyond Binary Outcomes paper - http://support.sas.com/resources/pape...
Free Statistics 1 e-Course - https://support.sas.com/edu/schedules...
Free Intro to Statistical Concepts e-Course - https://support.sas.com/edu/schedules...
Statistical Analysis learning path - http://support.sas.com/training/us/pa...
SAS Tutorials on Logistic Regression - https://video.sas.com/detail/video/57...
Fair Policy Targeting
Viviano, Davide, Bradic, Jelena
One of the major concerns of targeting interventions on individuals in social welfare programs is discrimination: individualized treatments may induce disparities on sensitive attributes such as age, gender, or race. This paper addresses the question of the design of fair and efficient treatment allocation rules. We adopt the non-maleficence perspective of "first do no harm": we propose to select the fairest allocation within the Pareto frontier. We provide envy-freeness justifications to novel counterfactual notions of fairness. We discuss easy-to-implement estimators of the policy function, by casting the optimization into a mixed-integer linear program formulation. We derive regret bounds on the unfairness of the estimated policy function, and small sample guarantees on the Pareto frontier. Finally, we illustrate our method using an application from education economics.
Hedging with Neural Networks
We study neural networks as nonparametric estimation tools for the hedging of options. To this end, we design a network, named HedgeNet, that directly outputs a hedging strategy. This network is trained to minimise the hedging error instead of the pricing error. Applied to end-of-day and tick prices of S&P 500 and Euro Stoxx 50 options, the network is able to reduce the mean squared hedging error of the Black-Scholes benchmark significantly. We illustrate, however, that a similar benefit arises by simple linear regressions that incorporate the leverage effect. Finally, we show how a faulty training/test data split, possibly along with an additional 'tagging' of data, leads to a significant overestimation of the outperformance of neural networks.
An easy guide to choose the right Machine Learning algorithm - KDnuggets
Well, there is no straightforward, guaranteed answer to this question. The answer depends on many factors, such as the problem statement and the kind of output you want, the type and size of the data, the available computational time, and the number of features and observations in the data, to name a few. Here are some important considerations while choosing an algorithm. It is usually recommended to gather a good amount of data to get reliable predictions. However, the availability of data is often a constraint.
Interpreting the Coefficients of a Regression Model with an Interaction Term: A Detailed…
Adding an interaction term to a regression model becomes necessary when the relationship between an explanatory variable and an outcome variable depends on the value/level of another explanatory variable. Although the addition of an interaction term can result in a more meaningful empirical model, it simultaneously complicates the interpretation of model coefficients. In this article, we are going to learn how to interpret the coefficients of a regression model that includes a two-way interaction term. By the end of this article, we should understand how the interpretation of model coefficients differs between a model with an interaction term and a model without an interaction term. We are going to use the statistical software R for building the models and visualizing the outcomes.
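To make the interpretation concrete, here is a minimal sketch, not the article's own example (the article works in R; Python is used here only for illustration). It assumes a model of the form y = b0 + b1*x + b2*z + b3*x*z, fits it by ordinary least squares on hypothetical, noise-free data, and shows that the slope of x is no longer the single number b1 but b1 + b3*z, which changes with z. The data, the coefficient values, and the helper names `solve_linear_system`, `fit_ols`, and `slope_of_x` are all assumptions made for this sketch.

```python
from itertools import product


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations act on both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free data generated from y = 1 + 2x + 3z + 0.5xz,
# with x continuous-like and z a binary indicator.
points = list(product([0.0, 1.0, 2.0, 3.0], [0.0, 1.0]))
design = [[1.0, x, z, x * z] for x, z in points]  # columns: 1, x, z, x*z
y = [1.0 + 2.0 * x + 3.0 * z + 0.5 * x * z for x, z in points]

b0, b1, b2, b3 = fit_ols(design, y)


def slope_of_x(z):
    """Effect of a one-unit increase in x on y, holding z fixed."""
    return b1 + b3 * z


print(f"slope of x when z = 0: {slope_of_x(0.0):.3f}")
print(f"slope of x when z = 1: {slope_of_x(1.0):.3f}")
```

In a model without the interaction column, b1 would be read as the effect of x at every value of z. With the interaction, b1 is only the effect of x when z = 0, and b3 is the amount by which that effect shifts per unit of z.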
Build and deploy your first machine learning web app - KDnuggets
In our last post we demonstrated how to train and deploy machine learning models in Power BI using PyCaret. If you haven't heard about PyCaret before, please read our announcement to get a quick start. In this tutorial we will use PyCaret to develop a machine learning pipeline that includes preprocessing transformations and a regression model to predict patient hospitalization charges based on demographic and basic patient health risk metrics such as age, BMI, and smoking status. PyCaret is an open source, low-code machine learning library in Python to train and deploy machine learning pipelines and models in production. PyCaret can be installed easily using pip. Flask is a framework that allows you to build web applications.
Data-driven Efficient Solvers and Predictions of Conformational Transitions for Langevin Dynamics on Manifold in High Dimensions
Gao, Yuan, Liu, Jian-Guo, Wu, Nan
We work on dynamic problems with collected data $\{\mathsf{x}_i\}$ that are distributed on a manifold $\mathcal{M}\subset\mathbb{R}^p$. Through the diffusion map, we first learn the reaction coordinates $\{\mathsf{y}_i\}\subset \mathcal{N}$ where $\mathcal{N}$ is a manifold isometrically embedded into a Euclidean space $\mathbb{R}^\ell$ for $\ell \ll p$. The reaction coordinates enable us to obtain an efficient approximation for the dynamics described by a Fokker-Planck equation on the manifold $\mathcal{N}$. By using the reaction coordinates, we propose an implementable, unconditionally stable, data-driven upwind scheme which automatically incorporates the manifold structure of $\mathcal{N}$. Furthermore, we provide a weighted $L^2$ convergence analysis of the upwind scheme to the Fokker-Planck equation. The proposed upwind scheme leads to a Markov chain with transition probability between the nearest neighbor points. We can exploit this property to directly conduct manifold-related computations such as finding the optimal coarse-grained network and the minimal energy path that represents chemical reactions or conformational changes. To establish the Fokker-Planck equation, we need to acquire information about the equilibrium potential of the physical system on $\mathcal{N}$. Hence, we apply a Gaussian Process regression algorithm to generate the equilibrium potential for a new physical system with new parameters. Combined with the proposed upwind scheme, this lets us calculate the trajectory of the Fokker-Planck equation on $\mathcal{N}$ based on the generated equilibrium potential. Finally, we develop an algorithm to pull back the trajectory to the original high-dimensional space as generated data for the new physical system.
Nonparametric inverse probability weighted estimators based on the highly adaptive lasso
Ertefaie, Ashkan, Hejazi, Nima S., van der Laan, Mark J.
Inverse probability weighted estimators are the oldest and potentially most commonly used class of procedures for the estimation of causal effects. By adjusting for selection biases via a weighting mechanism, these procedures estimate an effect of interest by constructing a pseudo-population in which selection biases are eliminated. Despite their ease of use, these estimators require the correct specification of a model for the weighting mechanism, are known to be inefficient, and suffer from the curse of dimensionality. We propose a class of nonparametric inverse probability weighted estimators in which the weighting mechanism is estimated via undersmoothing of the highly adaptive lasso, a nonparametric regression function proven to converge at $n^{-1/3}$-rate to the true weighting mechanism. We demonstrate that our estimators are asymptotically linear with variance converging to the nonparametric efficiency bound. Unlike doubly robust estimators, our procedures require neither derivation of the efficient influence function nor specification of the conditional outcome model. Our theoretical developments have broad implications for the construction of efficient inverse probability weighted estimators in large statistical models and a variety of problem settings. We assess the practical performance of our estimators in simulation studies and demonstrate use of our proposed methodology with data from a large-scale epidemiologic study.
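As a rough illustration of the weighting idea described in the abstract, the sketch below computes a basic (Horvitz-Thompson style) inverse probability weighted estimate of an average treatment effect. It is not the authors' estimator: the propensity scores are simply assumed known here, whereas the paper estimates the weighting mechanism with an undersmoothed highly adaptive lasso. The records, the `PROPENSITY` table, and the function names are hypothetical.

```python
# Hypothetical records: (stratum x, treatment a, outcome y).
# Outcomes follow y = 1 + 2*a + 3*x, so the true average treatment effect is 2.
# Treatment is more likely in stratum 1, which also has higher outcomes,
# so a naive comparison of treated and untreated units is confounded.
records = [
    (0, 1, 3.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
    (1, 1, 6.0), (1, 1, 6.0), (1, 1, 6.0), (1, 0, 4.0),
]

# Assumed-known probability of treatment given the stratum (the "weighting
# mechanism"). In practice this would have to be estimated from data.
PROPENSITY = {0: 0.25, 1: 0.75}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """Difference in mean outcome between treated and untreated units."""
    treated = [y for _, a, y in rows if a == 1]
    control = [y for _, a, y in rows if a == 0]
    return mean(treated) - mean(control)


def ipw_ate(rows, propensity):
    """Inverse probability weighted estimate of the average treatment effect.

    Each treated unit is weighted by 1 / e(x) and each untreated unit by
    1 / (1 - e(x)), which builds a pseudo-population where treatment no
    longer depends on the stratum.
    """
    treated_term = mean(a * y / propensity[x] for x, a, y in rows)
    control_term = mean((1 - a) * y / (1.0 - propensity[x]) for x, a, y in rows)
    return treated_term - control_term


naive = naive_difference(records)
ipw = ipw_ate(records, PROPENSITY)
print(f"naive difference: {naive:.2f}")  # biased upward by confounding
print(f"IPW estimate:     {ipw:.2f}")    # close to the true effect of 2
```

On this toy data the naive difference is 3.5, while the weighted estimate recovers the effect of 2 built into the outcomes. The result depends entirely on the propensities being correct, which is the model-specification requirement the abstract points to.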