Regression
Classification-Based Machine Learning for Finance
Finally, a comprehensive hands-on machine learning course with a specific focus on classification-based models for the investment community and passionate investors. In the past few years, there has been massive adoption of, and growth in, data science, artificial intelligence and machine learning to find alpha. However, information on and application of machine learning to investment are scarce. This course has been designed to address that. It is meant to spark your creative juices and get you started in this space.
Regression Models Coursera
Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. Regression models, a subset of linear models, are the most important statistical analysis tool in a data scientist's toolkit. This course covers regression analysis, least squares and inference using regression models. Special cases of the regression model, ANOVA and ANCOVA, will be covered as well. Analysis of residuals and variability will be investigated.
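As a rough illustration of the least squares and residual analysis mentioned above, here is a minimal sketch of a one-predictor fit in plain Python. The function names and the toy data are hypothetical and are not taken from the course.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed outcome minus fitted value for every data point."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical toy data, roughly following y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_ols(x, y)
res = residuals(x, y, intercept, slope)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

With an intercept in the model, the residuals sum to zero up to floating-point error, which is a quick sanity check on any least squares fit.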
What a CEO needs to know about Machine Learning algorithms
During my first project at McKinsey in 2011, I served the CEO of a bank regarding his small business strategy. I wanted to run a linear regression on the bank's data but my boss told me: "Don't do it." Artificial Intelligence is the most general-purpose technology of our time. New products and processes are being developed thanks to better vision systems, speech recognition technologies or recommendation engines based on Machine Learning. In fact, most recent advances in Artificial Intelligence have been achieved in the area of Machine Learning. Long before McKinsey, in 2004, I started my career as a mobile software developer. At that time I had to write precise instructions for every step of my code. Developing the voice recognition system of today's phones would have been tedious and error-prone back then. It would have required literally hundreds of thousands of detailed instructions to codify every single step, including identifying phonemes from sound waves, grouping them into ...
How To Build a Basic Website Based on Real-Time Predictions
The first model (using a logistic regression classifier) gave me an accuracy of more than 80 percent and an AUC of 84 percent. The results were also meaningful: they showed that the probability of an account belonging to a company increases with the number of Tweets, likes, or followings. On the other hand, if you've linked to an Instagram or LinkedIn account in your bio, you are more likely to be a human.
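For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how accuracy and AUC can be computed from predicted probabilities. The labels and scores are made up for illustration and have nothing to do with the article's data.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Share of cases where the thresholded probability matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is O(n^2), which is fine
    for a small illustration.
    """
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = company account) and model scores.
y_true = [1, 0, 1, 0, 1]
y_prob = [0.9, 0.4, 0.6, 0.7, 0.3]

print(accuracy(y_true, y_prob), auc(y_true, y_prob))
```

Accuracy depends on the chosen threshold, while AUC summarises how well the scores rank positives above negatives across all thresholds.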
A Beginner's Guide to EDA with Linear Regression -- Part 3
Mother Race -- but what we are seeing on the X-axis here is a bunch of variables. When you look closer you will notice that each variable seems to represent a unique value of the Mother Race variable. The linear regression function 'lm' in R automatically transforms a categorical variable into something called 'dummy' variables. It will create a column for each categorical value (e.g. Japanese) and have a value of 0 or 1 based on whether a given row matches a given column (e.g.
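The dummy-variable transformation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not R's implementation; the function name, the `drop_first` parameter and the sample values are hypothetical. By default R's treatment contrasts leave out one baseline level, which `drop_first=True` imitates here.

```python
def dummy_encode(values, drop_first=True):
    """Turn a list of category labels into 0/1 indicator columns.

    Returns the column names and one row of indicators per input value.
    With drop_first=True the first level (in sorted order) becomes the
    baseline and gets no column of its own.
    """
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows


# Hypothetical sample of a categorical column.
race = ["Japanese", "Chinese", "Japanese", "Filipino"]

columns, rows = dummy_encode(race)
print(columns)
for r in rows:
    print(r)
```

A row of all zeros then stands for the baseline category, and each coefficient is read as the difference from that baseline.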
How Many Machines Can We Use in Parallel Computing for Kernel Ridge Regression?
Liu, Meimei, Shang, Zuofeng, Cheng, Guang
This paper attempts to solve a basic problem in distributed statistical inference: how many machines can we use in parallel computing? In kernel ridge regression, we address this question in two important settings: nonparametric estimation and hypothesis testing. Specifically, we find a range for the number of machines under which optimal estimation/testing is achievable. The employed empirical processes method provides a unified framework that allows us to handle various regression problems (such as thin-plate splines and nonparametric additive regression) under different settings (such as univariate, multivariate and diverging-dimensional designs). It is worth noting that the upper bounds on the number of machines are proven to be unimprovable (up to a logarithmic factor) in two important cases: smoothing spline regression and Gaussian RKHS regression. Our theoretical findings are backed by thorough numerical studies.
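To make the setting concrete, here is a minimal sketch of distributed kernel ridge regression, assuming the common divide-and-conquer scheme in which each machine fits kernel ridge regression on its own subsample and the local estimates are averaged. The abstract does not spell out the algorithm, so the scheme, the Gaussian kernel, the bandwidth, the penalty and all names below are assumptions made for illustration.

```python
import math


def gaussian_kernel(a, b, bandwidth=0.2):
    return math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_krr(xs, ys, lam):
    """Kernel ridge regression: solve (K + n*lam*I) alpha = y."""
    n = len(xs)
    K = [[gaussian_kernel(xi, xj) + (n * lam if i == j else 0.0)
          for j, xj in enumerate(xs)] for i, xi in enumerate(xs)]
    alpha = solve(K, ys)
    return lambda x: sum(a * gaussian_kernel(x, xi) for a, xi in zip(alpha, xs))


def fit_distributed_krr(xs, ys, n_machines, lam):
    """Fit one local model per machine and average their predictions."""
    models = [fit_krr(xs[m::n_machines], ys[m::n_machines], lam)
              for m in range(n_machines)]
    return lambda x: sum(f(x) for f in models) / n_machines


# Hypothetical noise-free data from y = sin(2*pi*x) on an even grid.
xs = [i / 119 for i in range(120)]
ys = [math.sin(2 * math.pi * x) for x in xs]
lam = 1e-3

averaged = fit_distributed_krr(xs, ys, n_machines=3, lam=lam)
print(averaged(0.25))
```

Each local fit sees only a third of the data, so its cost drops sharply. The question the paper studies is how large `n_machines` can grow before the averaged estimator stops being statistically optimal.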
A Beginner's Guide to EDA with Linear Regression -- Part 2
So far, we have investigated whether Father Age and Mother Age were impacting Gestation Week, and we know that both Father Age and Mother Age influence the changes in Gestation Week. But since we have done the investigation separately, one for Father Age's influence on Gestation Week and another for Mother Age's influence on Gestation Week, we still don't know which of Father Age and Mother Age is the direct cause of the influence. In this post, I'm going to investigate further to find this out. So far, we know that increases in Father Age make Gestation Week shorter. And increases in Mother Age also make Gestation Week shorter.
A Beginner's Guide to Exploratory Data Analysis with Linear Regression -- Part 1
Linear Regression is an algorithm that helps us predict an unknown numeric outcome in the future. It is usually one of the first Machine Learning (or Statistical) algorithms to learn when you are stepping into the world of Data Science or Machine Learning. Though it is one of the 'old school' Statistical algorithms, it is still the most often used algorithm among many data scientists even today thanks to its simplicity and explainability. We at Exploratory always focus on, as the name suggests, making Exploratory Data Analysis (EDA) easier. EDA is a practice of iteratively asking a series of questions about data and trying to gain useful insights out of the data to answer the questions and essentially to influence our decision making.
Deep Learning Prerequisites: Logistic Regression in Python
This course is a lead-in to deep learning and neural networks - it covers a popular and fundamental technique used in machine learning, data science and statistics: logistic regression. We cover the theory from the ground up: derivation of the solution, and applications to real-world problems. We show you how one might code their own logistic regression module in Python. This course does not require any external materials. Everything needed (Python, and some Python libraries) can be obtained for free.
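To give a flavour of what coding one's own logistic regression in Python might look like, here is a minimal sketch that uses batch gradient descent on the log loss. It is not the course's code; the function names, learning rate, epoch count and toy data are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Gradient of the log loss w.r.t. the linear score is (p - y).
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Hypothetical one-feature data: small values are class 0, large are class 1.
X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
probs = predict_proba(X, w, b)
print([round(p, 2) for p in probs])
```

The same update rule carries over to a single sigmoid neuron, which is one reason logistic regression is often treated as a stepping stone to neural networks.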