AITopics | Regression


Regression


Accuracy versus interpretability? With generalized additive models (GAMs), you can have both

#artificialintelligence | Mar-26-2022, 02:01:24 GMT

In this post, I will provide an overview of generalized additive models (GAMs) and their desirable features. Predictive accuracy has long been an important goal of machine learning. But model interpretability has received more attention in recent years. Stakeholders, such as executives, regulators, and domain experts, often want to understand how and why a model makes its predictions before they trust it enough to use it in practice. However, when you train a machine learning model, you typically face a tradeoff between accuracy and interpretability.
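For orientation, the standard textbook form of a GAM is sketched below. The notation is an assumption, since the post's own symbols do not appear in this excerpt: $g$ is a link function, $\beta_0$ an intercept, and each $f_j$ a smooth component function of a single feature $x_j$.

$$
g\big(\mathbb{E}[y \mid x_1, \dots, x_p]\big) = \beta_0 + \sum_{j=1}^{p} f_j(x_j)
$$

Because each $f_j$ depends on one feature only, it can be plotted and inspected on its own, which is where the interpretability comes from.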

component function, gam, interpretability, (15 more...)

#artificialintelligence

Genre: Overview (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.31)


Stacked ensembles -- improving model performance on a higher level

#artificialintelligence | Mar-26-2022, 00:18:50 GMT

One particular problem arises at this stage. How do we know what model to choose as the meta-model? Unfortunately, there hasn't been any research into this area and the selection of a meta-model is more of an art than a science. In most of the papers discussing stacked models, the meta-model used is often just a simple model such as Linear Regression for regression tasks and Logistic Regression for classification tasks. One reason more complex meta-models are often not chosen is that they are much more likely to overfit to the predictions from the base models.

linear regression, model performance, regression, (3 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.77)


All About Logistic Regression

#artificialintelligence | Mar-22-2022, 12:12:57 GMT

Originally published on Towards AI. Logistic Regression is a supervised machine learning algorithm used in classification problems, where the independent variables are used to assign the dependent variable to one of two or more categories or classes.

classification model, independent variable, logistic regression, (11 more...)

#artificialintelligence

Genre:

Research Report > New Finding (0.79)
Research Report > Experimental Study (0.79)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)


Adaptive Verifiable Coded Computing: Towards Fast, Secure and Private Distributed Machine Learning

Tang, Tingting, Ali, Ramy E., Hashemi, Hanieh, Gangwani, Tynan, Avestimehr, Salman, Annavaram, Murali

arXiv.org Artificial Intelligence | Mar-22-2022

Stragglers, Byzantine workers, and data privacy are the main bottlenecks in distributed cloud computing. Some prior works proposed coded computing strategies to jointly address all three challenges. They require either a large number of workers, a significant communication cost or a significant computational complexity to tolerate Byzantine workers. Much of the overhead in prior schemes comes from the fact that they tightly couple coding for all three problems into a single framework. In this paper, we propose the Adaptive Verifiable Coded Computing (AVCC) framework, which decouples the Byzantine node detection challenge from the straggler tolerance. AVCC leverages coded computing just for handling stragglers and privacy, and then uses an orthogonal approach that leverages verifiable computing to mitigate Byzantine workers. Furthermore, AVCC dynamically adapts its coding scheme to trade off straggler tolerance with Byzantine protection. We evaluate AVCC on a compute-intensive distributed logistic regression application. Our experiments show that AVCC achieves up to $4.2\times$ speedup and up to $5.1\%$ accuracy improvement over the state-of-the-art Lagrange coded computing approach (LCC). AVCC also speeds up the conventional uncoded implementation of distributed logistic regression by up to $7.6\times$, and improves the test accuracy by up to $12.1\%$.

avcc, computation, node, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IPDPS53621.2022.00067

2107.12958

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > North Carolina (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)

Genre: Research Report > New Finding (0.69)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)


Top Machine Learning Algorithms for Regression

#artificialintelligence | Mar-21-2022, 12:56:43 GMT

Linear regression finds the optimal linear relationship between the independent variables and the dependent variable, and makes predictions accordingly. The simplest form is $y = b_0 + b_1 x$. When there is only one input feature, the linear regression model fits a line in two-dimensional space so as to minimize the residuals between predicted and actual values. The common cost function for measuring the magnitude of the residuals is the residual sum of squares (RSS). As more features are introduced, simple linear regression evolves into multiple linear regression, $y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$. Feel free to check out my article if you want a specific guide to the simple linear regression model.
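The one-feature case described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the article's own code: the function names `fit_simple_linear` and `rss` and the toy data are hypothetical, and the fit uses the standard closed-form least-squares solution.

```python
# Minimal sketch of simple linear regression (one feature) fitted by
# ordinary least squares. Function names and data are hypothetical.

def fit_simple_linear(xs, ys):
    """Return (b0, b1) that minimize the residual sum of squares."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Closed-form slope: covariance of x and y over variance of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


def rss(xs, ys, b0, b1):
    """Residual sum of squares of the line y = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Toy data generated from y = 1 + 2x with no noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_linear(xs, ys)
print(b0, b1, rss(xs, ys, b0, b1))
```

On noise-free data the fitted line recovers the generating coefficients and the RSS is zero; with real data the RSS is the quantity being minimized rather than eliminated.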

coefficient, regression, top machine learning algorithm, (7 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)


When regression coefficients change over time: A proposal

Schierholz, Malte

arXiv.org Machine Learning | Mar-19-2022

A common approach in forecasting problems is to estimate a least-squares regression (or other statistical learning models) from past data, which is then applied to predict future outcomes. An underlying assumption is that the same correlations that were observed in the past still hold for the future. We propose a model for situations when this assumption is not met: adopting methods from the state space literature, we model how regression coefficients change over time. Our approach can shed light on the large uncertainties associated with forecasting the future, and how much of this is due to changing dynamics of the past. Our simulation study shows that accurate estimates are obtained when the outcome is continuous, but the procedure fails for binary outcomes.
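As a point of reference, one common state-space formulation of time-varying regression coefficients is sketched below. It is an assumption for illustration only: the abstract does not give the paper's equations, and its model may differ. Here $y_t$ is the outcome at time $t$, $x_t$ the feature vector, $\beta_t$ the coefficient vector, and $\varepsilon_t$, $\eta_t$ are noise terms.

$$
y_t = x_t^\top \beta_t + \varepsilon_t, \qquad \beta_t = \beta_{t-1} + \eta_t
$$

Setting the variance of $\eta_t$ to zero recovers ordinary regression with constant coefficients, which is the assumption the abstract says may fail.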

coefficient, equation, time point, (15 more...)

arXiv.org Machine Learning

2203.10302

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)


Logistic Regression with Example

#artificialintelligence | Mar-18-2022, 03:19:29 GMT

Logistic Regression is a supervised machine learning algorithm used for classification. Examples of classification include: email spam or ham, will buy or will not buy a product, and disease predictions such as cancerous or noncancerous cells. Logistic regression is a probability problem, meaning that the output of the algorithm lies between 0 and 1. It uses a threshold value to classify the data points (samples).
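The probability-then-threshold step described above can be sketched as follows. This is a minimal illustration, not the article's code: the weights `w`, the bias `b`, and the function names are hypothetical, and the weights are fixed by hand rather than fitted to data.

```python
import math

# Minimal sketch: logistic regression maps a linear score to a probability
# in (0, 1) with the sigmoid, then applies a threshold to pick a class.
# The weights and bias are hypothetical, hand-picked values (not fitted).


def sigmoid(z):
    """Logistic function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, w, b):
    """Probability of the positive class for feature vector x."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(score)


def classify(x, w, b, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if predict_proba(x, w, b) >= threshold else 0


w = [2.0, -1.0]
b = -0.5
for sample in ([0.25, 0.0], [0.0, 1.0], [2.0, 0.5]):
    print(sample, predict_proba(sample, w, b), classify(sample, w, b))
```

Raising the threshold above 0.5 trades fewer false positives for more false negatives, which is why it is treated as a tunable value rather than a fixed constant.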

equation, exp, regression, (10 more...)

#artificialintelligence

Genre:

Research Report > New Finding (0.96)
Research Report > Experimental Study (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)


GAM(L)A: An econometric model for interpretable Machine Learning

Flachaire, Emmanuel, Hacheme, Gilles, Hué, Sullivan, Laurent, Sébastien

arXiv.org Machine Learning | Mar-17-2022

Despite their high predictive performance, random forest and gradient boosting are often considered black boxes or uninterpretable models, which has raised concerns from practitioners and regulators. As an alternative, we propose in this paper to use partial linear models that are inherently interpretable. Specifically, this article introduces GAM-lasso (GAMLA) and GAM-autometrics (GAMA), denoted GAM(L)A for short. GAM(L)A combines parametric and non-parametric functions to accurately capture linearities and non-linearities prevailing between dependent and explanatory variables, and a variable selection procedure to control for overfitting issues. Estimation relies on a two-step procedure building upon the double residual method. We illustrate the predictive performance and interpretability of GAM(L)A on a regression and a classification problem. The results show that GAM(L)A outperforms parametric models augmented by quadratic, cubic and interaction effects. Moreover, the results also suggest that the performance of GAM(L)A is not significantly different from that of random forest and gradient boosting.

algorithm, gamla, predictive performance, (15 more...)

arXiv.org Machine Learning

2203.11691

Country:

North America > United States (0.14)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance > Credit (0.47)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)


Stability and Risk Bounds of Iterative Hard Thresholding

Yuan, Xiao-Tong, Li, Ping

arXiv.org Machine Learning | Mar-17-2022

In this paper, we analyze the generalization performance of the Iterative Hard Thresholding (IHT) algorithm widely used for sparse recovery problems. The parameter estimation and sparsity recovery consistency of IHT has long been known in compressed sensing. From the perspective of statistical learning, another fundamental question is how well the IHT estimation would predict on unseen data. This paper makes progress towards answering this open question by introducing a novel sparse generalization theory for IHT under the notion of algorithmic stability. Our theory reveals that: 1) under natural conditions on the empirical risk function over $n$ samples of dimension $p$, IHT with sparsity level $k$ enjoys an $\mathcal{\tilde O}(n^{-1/2}\sqrt{k\log(n)\log(p)})$ rate of convergence in sparse excess risk; 2) a tighter $\mathcal{\tilde O}(n^{-1/2}\sqrt{\log(n)})$ bound can be established by imposing an additional iteration stability condition on a hypothetical IHT procedure invoked to the population risk; and 3) a fast rate of order $\mathcal{\tilde O}\left(n^{-1}k(\log^3(n)+\log(p))\right)$ can be derived for strongly convex risk function under proper strong-signal conditions. The results have been substantialized to sparse linear regression and sparse logistic regression models to demonstrate the applicability of our theory. Preliminary numerical evidence is provided to confirm our theoretical predictions.
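For readers unfamiliar with IHT, the basic iteration (a gradient step followed by keeping only the $k$ largest-magnitude coefficients) can be sketched for sparse least squares as below. This is a minimal illustration under stated assumptions, not the paper's experimental setup: the function names, step size, iteration count, and toy data are all hypothetical.

```python
# Minimal sketch of Iterative Hard Thresholding (IHT) for sparse least squares:
#   w <- H_k(w - step * gradient), where H_k keeps the k largest-magnitude
#   entries and zeroes the rest. Names, step size, and data are hypothetical.


def hard_threshold(w, k):
    """Keep the k entries of w with largest absolute value; zero the others."""
    order = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    keep = set(order[:k])
    return [w[j] if j in keep else 0.0 for j in range(len(w))]


def iht_least_squares(X, y, k, step=0.5, iters=100):
    """Approximately minimize (1/2n)||Xw - y||^2 subject to at most k nonzeros."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(iters):
        # Residuals of the current iterate.
        residual = [sum(X[i][j] * w[j] for j in range(p)) - y[i] for i in range(n)]
        # Gradient of the empirical risk: (1/n) X^T (Xw - y).
        grad = [sum(X[i][j] * residual[i] for i in range(n)) / n for j in range(p)]
        # Gradient step followed by projection onto k-sparse vectors.
        w = hard_threshold([w[j] - step * grad[j] for j in range(p)], k)
    return w


# Toy design with orthogonal columns; y is generated from w* = [2, 0, -3].
X = [[1.0, 1.0, 1.0],
     [1.0, -1.0, 1.0],
     [1.0, 1.0, -1.0],
     [1.0, -1.0, -1.0]]
y = [-1.0, -1.0, 5.0, 5.0]
w_hat = iht_least_squares(X, y, k=2)
print(w_hat)
```

The toy design is deliberately well conditioned so that the iteration converges to the generating 2-sparse vector; on harder designs the step size and the sparsity level $k$ both need care.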

excess risk, iht, probability, (17 more...)

arXiv.org Machine Learning

2203.09413

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Arizona > Maricopa County > Phoenix (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(17 more...)

Genre: Research Report > New Finding (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)


How to Interpret Machine Learning Models with Python -- Part 1 (easy)

#artificialintelligence | Mar-14-2022, 04:10:27 GMT

In this article, I will try to interpret the Linear Regression, Lasso, and Decision Tree models, which are inherently interpretable. I will analyze global interpretability, which identifies the most important features for prediction in general, and local interpretability, which explains individual prediction results. Machine learning models are used in applications such as fraud and risk detection in bank transactions, voice assistants, recommendation systems, chatbots, self-driving cars, social network analysis, etc. However, sometimes it is difficult to interpret them because the algorithm represents a black box (e.g. …). So we need additional techniques to analyze black box decisions.

explanation, interpret machine learning model, linear model, (13 more...)

#artificialintelligence

Industry: Transportation (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.45)
