AITopics | Regression

Regression

An Upgraded Marketing Mix Modeling in Python

#artificialintelligence | Sep-23-2021, 20:25:11 GMT

In my last article, I introduced you to the world of marketing mix modeling. If you have not read it yet, please do so before you proceed. There, we created a linear regression model that predicts sales from raw advertising spend in several channels, such as TV, radio, and web banners. For me as a machine learning practitioner, such a model is already nice on its own. Even better, it also makes business people happy, because the model lets us calculate ROIs and thus judge how well each channel performed.
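As a rough illustration of how ROIs can be read off a linear model, the sketch below divides the sales attributed to each channel by the money spent on it. The coefficients, spend figures, and the `channel_roi` helper are all hypothetical and are not taken from the article.

```python
# Hypothetical coefficients of an already fitted plain linear model:
#   sales = base + coef_tv * tv + coef_radio * radio + coef_banners * banners
coefficients = {"tv": 0.36, "radio": 0.49, "banners": 1.22}

# Hypothetical weekly spend per channel (same currency as sales).
spends = {
    "tv": [13000.0, 0.0, 9000.0, 11000.0],
    "radio": [2000.0, 3000.0, 0.0, 2500.0],
    "banners": [1500.0, 1800.0, 2000.0, 0.0],
}


def channel_roi(coef, spend):
    """Return sales attributed to a channel divided by its total spend."""
    total_spend = sum(spend)
    if total_spend == 0:
        return float("nan")  # ROI is undefined when nothing was spent
    attributed_sales = sum(coef * s for s in spend)
    return attributed_sales / total_spend


rois = {name: channel_roi(coefficients[name], spends[name]) for name in spends}
for name, roi in rois.items():
    print(f"{name}: ROI = {roi:.2f}")
```

In a model on raw spend, this ratio simply equals the channel's coefficient. It only becomes a separate quantity once the spend is transformed before entering the model.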

artificial intelligence, machine learning, transformer, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

Introduction to Polynomial Regression Analysis

#artificialintelligence | Sep-23-2021, 06:25:16 GMT

Polynomial regression is one of the machine learning algorithms used for making predictions. For example, polynomial regression is widely applied to predict the spread rate of COVID-19 and other infectious diseases. If you would like to learn more about what polynomial regression analysis is, continue reading. Regression analysis is a helpful statistical tool for studying the correlation between two sets of events, or, statistically speaking, variables ― between a dependent variable and one or more independent variables. For example, your weight loss (dependent variable) depends on the number of hours you spend in the gym (independent variable).
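A minimal sketch of polynomial regression, assuming a single independent variable and an ordinary least-squares fit through the normal equations. The function names and the noise-free sample data are illustrative assumptions, not material from the article.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] without modifying the inputs.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Return least-squares coefficients [c0, c1, ..., c_degree]."""
    # Each row holds the powers x^0, x^1, ..., x^degree of one sample.
    powers = [[x ** k for k in range(degree + 1)] for x in xs]
    size = degree + 1
    ata = [[sum(row[i] * row[j] for row in powers) for j in range(size)]
           for i in range(size)]
    aty = [sum(row[i] * y for row, y in zip(powers, ys)) for i in range(size)]
    return solve_linear_system(ata, aty)


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Illustrative data generated from y = 2x^2 - 3x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2 * x ** 2 - 3 * x + 1 for x in xs]

coeffs = fit_polynomial(xs, ys, degree=2)
print([round(c, 6) for c in coeffs])
print(round(predict(coeffs, 6.0), 6))
```

The model stays linear in its coefficients, so fitting it is ordinary linear regression on the powers of x. For higher degrees or noisy data, a numerically sturdier solver than the normal equations is usually preferable.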

linear regression, polynomial regression, regression, (12 more...)

#artificialintelligence

Genre:

Research Report > New Finding (0.83)
Research Report > Experimental Study (0.83)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.57)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

ML-Logistic Regression

#artificialintelligence | Sep-23-2021, 04:45:55 GMT

There are optimization algorithms other than gradient descent. These algorithms automatically pick an appropriate learning rate alpha and are usually faster. One way to handle multiclass classification is "one-vs-all" binary classification: we train a binary classifier for each class against all the other classes and select the class whose classifier has the largest hypothesis output. Since we have 3 classes here, we do the binary classification 3 times.
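The one-vs-all scheme can be sketched as follows, assuming two input features, three classes, and plain batch gradient descent with a fixed learning rate. The toy data, function names, and hyperparameters are illustrative assumptions rather than the article's code.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hypothesis(w, point):
    """Logistic hypothesis for weights [bias, w1, w2] and a 2-D point."""
    x1, x2 = point
    return sigmoid(w[0] + w[1] * x1 + w[2] * x2)


def train_binary(points, targets, lr=0.1, steps=5000):
    """Fit one binary logistic regression by batch gradient descent."""
    w = [0.0, 0.0, 0.0]
    n = len(points)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), t in zip(points, targets):
            err = hypothesis(w, (x1, x2)) - t
            grad[0] += err
            grad[1] += err * x1
            grad[2] += err * x2
        w = [wi - lr * gi / n for wi, gi in zip(w, grad)]
    return w


def train_one_vs_all(points, labels):
    """Train one classifier per class: that class (1) against the rest (0)."""
    models = {}
    for c in sorted(set(labels)):
        targets = [1.0 if y == c else 0.0 for y in labels]
        models[c] = train_binary(points, targets)
    return models


def predict(models, point):
    """Pick the class whose classifier gives the largest hypothesis output."""
    return max(models, key=lambda c: hypothesis(models[c], point))


# Toy data: three well-separated clusters, one per class.
points = [(0, 0), (1, 0), (0, 1), (5, 0), (6, 1), (5, 1), (0, 5), (1, 6), (1, 5)]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

models = train_one_vs_all(points, labels)
print(predict(models, (0.5, 0.5)), predict(models, (5.5, 0.5)), predict(models, (0.5, 5.5)))
```

With 3 classes, `train_one_vs_all` runs the binary fit 3 times, matching the count given in the text.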

binary classification, ml-logistic regression

#artificialintelligence

Genre:

Research Report > New Finding (0.40)
Research Report > Experimental Study (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.40)

A Survey on Cost Types, Interaction Schemes, and Annotator Performance Models in Selection Algorithms for Active Learning in Classification

Herde, Marek, Huseljic, Denis, Sick, Bernhard, Calma, Adrian

arXiv.org Machine Learning | Sep-23-2021

Pool-based active learning (AL) aims to optimize the annotation process (i.e., labeling) as the acquisition of annotations is often time-consuming and therefore expensive. For this purpose, an AL strategy queries annotations intelligently from annotators to train a high-performance classification model at a low annotation cost. Traditional AL strategies operate in an idealized framework. They assume a single, omniscient annotator who never gets tired and charges uniformly regardless of query difficulty. However, in real-world applications, we often face human annotators, e.g., crowd or in-house workers, who make annotation mistakes and can be reluctant to respond if tired or faced with complex queries. Recently, a wide range of novel AL strategies has been proposed to address these issues. They differ in at least one of the following three central aspects from traditional AL: (1) They explicitly consider (multiple) human annotators whose performances can be affected by various factors, such as missing expertise. (2) They generalize the interaction with human annotators by considering different query and annotation types, such as asking an annotator for feedback on an inferred classification rule. (3) They take more complex cost schemes regarding annotations and misclassifications into account. This survey provides an overview of these AL strategies and refers to them as real-world AL. Therefore, we introduce a general real-world AL strategy as part of a learning cycle and use its elements, e.g., the query and annotator selection algorithm, to categorize about 60 real-world AL strategies. Finally, we outline possible directions for future research in the field of AL.

annotation, annotator, query, (15 more...)

arXiv.org Machine Learning

2109.11301

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Singapore (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(59 more...)

Genre: Overview (1.00)

Industry:

Energy (0.67)
Information Technology (0.67)
Education > Educational Setting (0.46)
Health & Medicine > Diagnostic Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
(3 more...)

High-dimensional regression with potential prior information on variable importance

Stokell, Benjamin G., Shah, Rajen D.

arXiv.org Machine Learning | Sep-23-2021

There are a variety of settings where vague prior information may be available on the importance of predictors in high-dimensional regression settings. Examples include ordering on the variables offered by their empirical variances (which is typically discarded through standardisation), the lag of predictors when fitting autoregressive models in time series settings, or the level of missingness of the variables. Whilst such orderings may not match the true importance of variables, we argue that there is little to be lost, and potentially much to be gained, by using them. We propose a simple scheme involving fitting a sequence of models indicated by the ordering. We show that the computational cost for fitting all models when ridge regression is used is no more than for a single fit of ridge regression, and describe a strategy for Lasso regression that makes use of previous fits to greatly speed up fitting the entire sequence of models. We propose to select a final estimator by cross-validation and provide a general result on the quality of the best performing estimator on a test set selected from among a number $M$ of competing estimators in a high-dimensional linear regression setting. Our result requires no sparsity assumptions and shows that only a $\log M$ price is incurred compared to the unknown best estimator. We demonstrate the effectiveness of our approach when applied to missing or corrupted data, and time series settings. An R package is available on github.

information, regression, variance, (16 more...)

arXiv.org Machine Learning

2109.11281

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Amazon SageMaker tutorial and model

#artificialintelligence | Sep-22-2021, 21:50:39 GMT

This code pattern describes a way to gain insights by using Watson OpenScale and a SageMaker machine learning model. It explains how to create a logistic regression model using Amazon SageMaker with data from the UC Irvine machine learning database. The pattern uses Watson OpenScale to bind the machine learning model deployed in the AWS cloud, create a subscription, and perform payload and feedback logging. With Watson OpenScale, you can monitor model quality and log payloads, regardless of where the model is hosted. This code pattern uses the example of an Amazon Web Service (AWS) SageMaker model, which demonstrates the independent and open nature of Watson OpenScale.

amazon sagemaker tutorial and model, neural network, watson openscale, (5 more...)

#artificialintelligence

Industry: Information Technology (0.77)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.58)

Facilitating human-wildlife cohabitation through conflict prediction

Ghosh, Susobhan, Varakantham, Pradeep, Bhatkhande, Aniket, Ahmad, Tamanna, Andheria, Anish, Li, Wenjun, Taneja, Aparna, Thakkar, Divy, Tambe, Milind

arXiv.org Artificial Intelligence | Sep-22-2021

With increasing world population and expanded use of forests as cohabited regions, interactions and conflicts with wildlife are increasing, leading to large-scale loss of lives (animal and human) and livelihoods (economic). While community knowledge is valuable, forest officials and conservation organisations can greatly benefit from predictive analysis of human-wildlife conflict, leading to targeted interventions that can potentially help save lives and livelihoods. However, the problem of prediction is a complex socio-technical problem in the context of limited data in low-resource regions. Identifying the "right" features to make accurate predictions of conflicts at the required spatial granularity using a sparse conflict training dataset is the key challenge that we address in this paper. Specifically, we do an illustrative case study on human-wildlife conflicts in the Bramhapuri Forest Division in Chandrapur, Maharashtra, India. Most existing work has considered human-wildlife conflicts in protected areas and to the best of our knowledge, this is the first effort at prediction of human-wildlife conflicts in unprotected areas and using those predictions for deploying interventions on the ground.

conflict, dataset, human-wildlife conflict, (14 more...)

arXiv.org Artificial Intelligence

2109.10637

Country:

Asia > India > Maharashtra (0.24)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Supervised Learning algorithms cheat-sheet

#artificialintelligence | Sep-21-2021, 18:00:46 GMT

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used later for mapping new examples. The most popular supervised learning tasks are regression and classification. The result of solving the regression task is a model that can make numerical predictions. The result of solving the classification task is a model that can make class predictions.

algorithm, regression, supervised learning algorithm cheat-sheet, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.37)

Simple Linear Regression

#artificialintelligence | Sep-21-2021, 09:21:13 GMT

Linear regression is an algorithm used to predict or visualise a relationship between two different features/variables. In linear regression tasks, there are two kinds of variables being examined: the dependent variable and the independent variable. Let us build our first simple linear regression model and learn along the way. This particular model is called simple because it has only one independent variable. Here we use data containing people's salaries and working experience to predict someone's salary based on their experience.
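A minimal sketch of such a model in plain Python, using the closed-form least-squares estimates for the slope and intercept. The experience and salary figures are made up for illustration, and the article itself may rely on a library class instead.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years of experience (independent) and salary (dependent).
years = [1.0, 2.0, 3.0, 4.0, 5.0]
salaries = [35000.0, 40000.0, 45000.0, 50000.0, 55000.0]

slope, intercept = fit_simple_linear_regression(years, salaries)
predicted = intercept + slope * 6.0

print(f"slope = {slope:.0f}, intercept = {intercept:.0f}")  # slope = 5000, intercept = 30000
print(f"predicted salary at 6 years = {predicted:.0f}")     # predicted salary at 6 years = 60000
```

Because the sample data lie exactly on a line, the fit recovers it exactly. Real salary data would scatter around the fitted line.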

independent variable, linearregression class, simple linear regression, (1 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Personalized Online Machine Learning

Malenica, Ivana, Phillips, Rachael V., Pirracchio, Romain, Chambaz, Antoine, Hubbard, Alan, van der Laan, Mark J.

arXiv.org Machine Learning | Sep-21-2021

In this work, we introduce the Personalized Online Super Learner (POSL) -- an online ensembling algorithm for streaming data whose optimization procedure accommodates varying degrees of personalization. Namely, POSL optimizes predictions with respect to baseline covariates, so personalization can vary from completely individualized (i.e., optimization with respect to baseline covariate subject ID) to many individuals (i.e., optimization with respect to common baseline covariates). As an online algorithm, POSL learns in real-time. POSL can leverage a diversity of candidate algorithms, including online algorithms with different training and update times, fixed algorithms that are never updated during the procedure, pooled algorithms that learn from many individuals' time-series, and individualized algorithms that learn from within a single time-series. POSL's ensembling of this hybrid of base learning strategies depends on the amount of data collected, the stationarity of the time-series, and the mutual characteristics of a group of time-series. In essence, POSL decides whether to learn across samples, through time, or both, based on the underlying (unknown) structure in the data. For a wide range of simulations that reflect realistic forecasting scenarios, and in a medical data application, we examine the performance of POSL relative to other current ensembling and online learning methods. We show that POSL is able to provide reliable predictions for time-series data and adjust to changing data-generating environments. We further cultivate POSL's practicality by extending it to settings where time-series enter/exit dynamically over chronological time.

algorithm, estimator, learner, (13 more...)

arXiv.org Machine Learning

2109.10452

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
(4 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
