AITopics | Regression

Regression

Decision Tree Algorithm - A Complete Guide - Analytics Vidhya

#artificialintelligence | Aug-29-2021, 20:01:13 GMT

So far we have covered linear regression and logistic regression, which were fairly hard to understand. Let's now turn to decision trees, and I assure you this is probably the easiest algorithm in machine learning. There is not much mathematics involved. Because a decision tree is easy to use and interpret, it is one of the most widely used and practical methods in machine learning. Root node: the node at the beginning of a decision tree; from this node the population starts dividing according to various features.
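
As a rough illustration of how a root node might be chosen, the sketch below scores each feature by information gain, that is, the drop in entropy after splitting on it. The snippet does not say which splitting criterion the article uses, so the entropy criterion, the function names, and the toy weather data are all assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the child nodes after the split.
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


def best_root_feature(rows, labels):
    """Pick the feature with the highest information gain as the root split."""
    return max(rows[0].keys(), key=lambda f: information_gain(rows, labels, f))


# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

root = best_root_feature(rows, labels)
print(root)
```

In this toy case splitting on "outlook" leaves two pure child nodes, so it is the feature the sketch selects for the root.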

decision tree, entropy, node, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.98)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)

All the Datasets You Need to Practice Data Science Skills and Make a Great Portfolio

#artificialintelligence | Aug-28-2021, 12:45:35 GMT

Every time I start a project to learn a new topic, I spend a significant amount of time finding a suitable dataset for it. As a result, I have collected quite a lot of datasets that helped me learn and build some cool projects for my portfolio. I am going to share those datasets in this article so that you have data to practice on and can build your own portfolio. This dataset has information on the Olympic results. Each row contains the data of a country. This dataset will give you a taste of data cleaning to start with.

dataset, exploratory data analysis, towardsdatascience, (11 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Math Behind Logistic Regression

#artificialintelligence | Aug-28-2021, 07:38:01 GMT

Before we try to understand the bizarre symbols used in Logistic Regression, let's recall the underlying idea of this technique. Logistic Regression is a machine learning technique that is widely used for classification problems. The definition above indicates that the algorithm is also useful for problems other than classification, regression for example. But this article will focus on classification only. What does a classification problem look like?
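
For readers who want the core idea in code, here is a minimal sketch of how logistic regression turns a linear score into a class label: the score passes through the sigmoid function to give a probability, which is then compared with a threshold. The weights, bias, inputs, and the 0.5 threshold are hypothetical values chosen for illustration and are not taken from the article.

```python
import math


def sigmoid(z):
    """Map a real-valued score to (0, 1); written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    """Probability of the positive class for one feature vector x."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)


def predict_class(weights, bias, x, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if predict_proba(weights, bias, x) >= threshold else 0


# Hypothetical, hand-picked parameters (a fitted model would learn these).
weights = [2.0, -1.0]
bias = 0.0

print(predict_proba(weights, bias, [1.0, 2.0]))  # score is 0, so probability is 0.5
print(predict_class(weights, bias, [3.0, 1.0]))  # positive score, class 1
print(predict_class(weights, bias, [0.0, 4.0]))  # negative score, class 0
```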

equation, logistic regression, probability value, (11 more...)

Country: Asia > India (0.05)

Genre:

Research Report > New Finding (0.87)
Research Report > Experimental Study (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Improve Linear Regression for Time Series Forecasting

#artificialintelligence | Aug-28-2021, 05:54:44 GMT

Time series forecasting is a fascinating task. However, building a machine-learning algorithm to predict future data is trickier than expected. The hardest thing to handle is the temporal dependency present in the data. By their nature, time-series data are subject to shifts. This may result in temporal drifts of various kinds, which may make our algorithm inaccurate.

linear model, linear tree, time series forecasting, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.45)

Targeting Underrepresented Populations in Precision Medicine: A Federated Transfer Learning Approach

Li, Sai, Cai, Tianxi, Duan, Rui

arXiv.org Machine Learning | Aug-27-2021

The limited representation of minorities and disadvantaged populations in large-scale clinical and genomics research has become a barrier to translating precision medicine research into practice. Due to heterogeneity across populations, risk prediction models are often found to underperform in these underrepresented populations, and therefore may further exacerbate known health disparities. In this paper, we propose a two-way data integration strategy that integrates heterogeneous data from diverse populations and from multiple healthcare institutions via a federated transfer learning approach. The proposed method can handle the challenging setting where sample sizes from different populations are highly unbalanced. With only a small number of communications across participating sites, the proposed method can achieve performance comparable to the pooled analysis where individual-level data are directly pooled together. We show that the proposed method improves the estimation and prediction accuracy in underrepresented populations, and reduces the gap of model performance across populations. Our theoretical analysis reveals how estimation accuracy is influenced by communication budgets, privacy restrictions, and heterogeneity across populations. We demonstrate the feasibility and validity of our methods through numerical experiments and a real application to a multi-center study, in which we construct polygenic risk prediction models for Type II diabetes in the AA population.

estimator, heterogeneity, initialization, (15 more...)

2108.12112

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania (0.04)
Europe > United Kingdom (0.04)
Asia > China (0.04)

Genre: Research Report > Experimental Study (0.88)

Industry:

Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.88)
Health & Medicine > Health Care Technology > Medical Record (0.68)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Polynomial Regression in Machine Learning

#artificialintelligence | Aug-26-2021, 09:25:47 GMT

Polynomial Regression is an important part of Machine Learning. It is a regression algorithm that models the relationship between the independent variable (x) and the dependent variable (y) as an nth-degree polynomial. In essence, it gives a best-fit estimate of the relationship between the dependent and independent variables. To convert multiple linear regression into polynomial regression, we add polynomial terms. It comes to the rescue when we have to deal with a dataset in which the relationship is not linear.
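
To make "adding polynomial terms" concrete, the sketch below expands each x into the features 1, x, x^2, ..., x^n and then fits ordinary least squares on those features by solving the normal equations. It is a dependency-free illustration under assumed names and toy data, not the article's implementation; in practice a library routine would normally be used, and the normal equations become numerically unstable at high degrees.

```python
def polynomial_features(x, degree):
    """Expand a scalar x into [1, x, x^2, ..., x^degree]."""
    return [x ** d for d in range(degree + 1)]


def solve_linear_system(a, b):
    """Solve a @ coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (X^T X) coef = X^T y."""
    design = [polynomial_features(x, degree) for x in xs]
    k = degree + 1
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1.0 + 2.0 * x + 3.0 * x ** 2 for x in xs]

coef = fit_polynomial(xs, ys, degree=2)
print(coef)  # coefficients ordered from the constant term upward
```

The model stays linear in its coefficients even though it is non-linear in x, which is why the same least-squares machinery as linear regression applies.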

machine learning, polynomial regression, regression, (9 more...)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Causal Inference in the Wild

#artificialintelligence | Aug-26-2021, 00:47:06 GMT

Causal inference is a hot topic in machine learning, and there are many excellent primers on the theory of causal inference available [1–4]. But far fewer examples of real-world applications of machine-learning-powered causal inference exist. This article introduces one such example from an industry context, using a (public) real-world dataset. It is aimed at a technical audience with an understanding of the basics of causality. Specifically, I will look at the "ideal" scenario of price elasticity estimation [2, 5].

confounder, elasticity, retailer, (14 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Hamilton (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Collinearity in Regression Model

#artificialintelligence | Aug-25-2021, 14:30:15 GMT

To make it clearer why collinearity is such a problem, let's take a look at a use case. For the use case, I am going to use the car dataset, which you can easily download from Kaggle. Let's imagine we want to predict the price of a car, the price variable in the dataset. To predict it, we will use certain independent variables such as the car's city MPG, highway MPG, horsepower, engine size, stroke, width, peak RPM, and compression ratio. Next, we build a regression model based on these independent variables.
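
One common way to check for collinearity before fitting such a model is to look at how strongly two predictors move together. The sketch below computes the Pearson correlation between two predictors and the corresponding variance inflation factor (VIF), which for the two-predictor case reduces to 1 / (1 - r^2). The city and highway MPG numbers are made up for illustration and are not taken from the Kaggle dataset, and the article may use a different diagnostic.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def vif_two_predictors(xs, ys):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2).

    With more predictors, r^2 is replaced by the R^2 from regressing one
    predictor on all the others.
    """
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r * r)


# Hypothetical values, chosen only to mimic two closely related predictors.
city_mpg = [21.0, 19.0, 24.0, 30.0, 27.0]
highway_mpg = [27.0, 26.0, 30.0, 37.0, 33.0]

r = pearson(city_mpg, highway_mpg)
vif = vif_two_predictors(city_mpg, highway_mpg)
print(r, vif)
```

A widely quoted rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic collinearity, although the right cutoff depends on the application.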

city mpg, highway mpg, regression model, (3 more...)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Heavy-tailed Streaming Statistical Estimation

Tsai, Che-Ping, Prasad, Adarsh, Balakrishnan, Sivaraman, Ravikumar, Pradeep

arXiv.org Machine Learning | Aug-25-2021

We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. This could also be viewed as stochastic optimization under heavy-tailed distributions, with an additional $O(p)$ space complexity constraint. We design a clipped stochastic gradient descent algorithm and provide an improved analysis, under a more nuanced condition on the noise of the stochastic gradients, which we show is critical when analyzing stochastic optimization problems arising from general statistical estimation problems. Our results guarantee convergence not just in expectation but with exponential concentration, and moreover do so using $O(1)$ batch size. We provide consequences of our results for mean estimation and linear regression. Finally, we provide empirical corroboration of our results and algorithms via synthetic experiments for mean estimation and linear regression.
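
The general idea of clipped stochastic gradient descent for streaming mean estimation can be sketched as follows: keep a single $p$-dimensional estimate, process one sample at a time, and rescale any gradient whose norm exceeds a clipping level. This is a generic illustration, not the paper's algorithm; the squared-error loss, the 1/t step size, the clipping level, and all names are assumptions made for the example.

```python
import math


def clip(vector, max_norm):
    """Rescale vector so its Euclidean norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm <= max_norm:
        return list(vector)
    scale = max_norm / norm
    return [v * scale for v in vector]


def clipped_sgd_mean(stream, dim, clip_level):
    """Streaming mean estimate using clipped SGD with batch size 1.

    Only the current estimate (dim floats) is stored, so memory is O(dim).
    The loss per sample x is 0.5 * ||theta - x||^2, whose gradient is theta - x.
    """
    theta = [0.0] * dim
    for t, x in enumerate(stream, start=1):
        grad = [th - xi for th, xi in zip(theta, x)]
        grad = clip(grad, clip_level)
        step = 1.0 / t  # assumed step-size schedule
        theta = [th - step * g for th, g in zip(theta, grad)]
    return theta


# Hypothetical stream with one extreme sample standing in for a heavy tail.
samples = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [500.0, -500.0], [1.1, 0.9]]

robust = clipped_sgd_mean(samples, dim=2, clip_level=2.0)
unclipped = clipped_sgd_mean(samples, dim=2, clip_level=float("inf"))
print(robust, unclipped)
```

Without clipping and with the 1/t step, the update reduces to the ordinary running mean, which the single extreme sample drags far from the bulk of the data; clipping bounds how far any one sample can move the estimate.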

algorithm, gradient, regression, (13 more...)

2108.11483

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.76)

Predicting Census Survey Response Rates via Interpretable Nonparametric Additive Models with Structured Interactions

Ibrahim, Shibal, Mazumder, Rahul, Radchenko, Peter, Ben-David, Emanuel

arXiv.org Machine Learning | Aug-24-2021

Accurate and interpretable prediction of survey response rates is important from an operational standpoint. The US Census Bureau's well-known ROAM application uses principled statistical models trained on the US Census Planning Database data to identify hard-to-survey areas. An earlier crowdsourcing competition revealed that an ensemble of regression trees led to the best performance in predicting survey response rates; however, the corresponding models could not be adopted for the intended application due to limited interpretability. In this paper, we present new interpretable statistical methods to predict, with high accuracy, response rates in surveys. We study sparse nonparametric additive models with pairwise interactions via $\ell_0$-regularization, as well as hierarchically structured variants that provide enhanced interpretability. Despite strong methodological underpinnings, such models can be computationally challenging, so we present new scalable algorithms for learning these models. We also establish novel non-asymptotic error bounds for the proposed estimators. Experiments based on the US Census Planning Database demonstrate that our methods lead to high-quality predictive models that permit actionable interpretability for different segments of the population. Interestingly, our methods provide significant gains in interpretability without losing predictive performance relative to state-of-the-art black-box machine learning methods based on gradient boosting and feedforward neural networks. Our code implementation in Python is available at https://github.com/ShibalIbrahim/Additive-Models-with-Structured-Interactions.

ac 13 17, interaction, interaction effect, (16 more...)

2108.11328

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(7 more...)

Genre: Questionnaire & Opinion Survey (1.00)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.92)
(2 more...)
