AITopics | Regression

Regression

Modelling tourism demand to Spain with machine learning techniques. The impact of forecast horizon on model selection

Claveria, Oscar, Monte, Enric, Torra, Salvador

arXiv.org Machine Learning, May-2-2018

This study assesses the influence of the forecast horizon on the forecasting performance of several machine learning techniques. We compare the forecast accuracy of Support Vector Regression (SVR) to Neural Network (NN) models, using a linear model as a benchmark. We focus on international tourism demand to all seventeen regions of Spain. The SVR with a Gaussian radial basis function kernel outperforms the rest of the models for the longest forecast horizons. We also find that machine learning methods improve their forecasting accuracy with respect to linear models as forecast horizons increase. This result shows the suitability of SVR for medium- and long-term forecasting.

artificial intelligence, machine learning, tourism demand, (16 more...)

arXiv.org Machine Learning

1805.00878

Country:

Europe > Spain (1.00)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)

Modelling cross-dependencies between Spain's regional tourism markets with an extension of the Gaussian process regression model

Claveria, Oscar, Monte, Enric, Torra, Salvador

arXiv.org Machine Learning, May-2-2018

This study presents an extension of the Gaussian process regression model for multiple-input multiple-output forecasting. This approach allows modelling the cross-dependencies between a given set of input variables and generating a vectorial prediction. Making use of the existing correlations in international tourism demand to all seventeen regions of Spain, the performance of the proposed model is assessed in a multiple-step-ahead forecasting comparison. The results of the experiment in a multivariate setting show that the Gaussian process regression model significantly improves on the forecasting accuracy of a multi-layer perceptron neural network used as a benchmark. The results reveal that incorporating the connections between different markets in the modelling process may prove very useful to refine predictions at a regional level.

artificial intelligence, forecasting, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1007/s13209-016-0144-7

1805.00861

Country: Europe > Spain > Catalonia (0.15)

Genre: Research Report > New Finding (0.48)

Industry: Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Building A Linear Regression with PySpark and MLlib

#artificialintelligence, May-1-2018, 20:42:03 GMT

Apache Spark has become one of the most commonly used and supported open-source tools for machine learning and data science. In this post, I'll help you get started using Apache Spark's spark.ml ... Our data is from the Kaggle competition: Housing Values in Suburbs of Boston. ... AGE: proportion of owner-occupied units built prior to 1940. ... BLACK: 1000(Bk - 0.63)², where Bk is the proportion of blacks by town. ... This is the target variable.

artificial intelligence, correlation, machine learning, (15 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.42)

Solid Harmonic Wavelet Scattering for Predictions of Molecule Properties

Eickenberg, Michael, Exarchakis, Georgios, Hirn, Matthew, Mallat, Stéphane, Thiry, Louis

arXiv.org Machine Learning, May-1-2018

We present a machine learning algorithm for the prediction of molecule properties inspired by ideas from density functional theory (DFT). Using Gaussian-type orbital functions, we create surrogate electronic densities of the molecule from which we compute invariant "solid harmonic scattering coefficients" that account for different types of interactions at different scales. Multi-linear regressions of various physical properties of molecules are computed from these invariant coefficients. Numerical experiments show that these regressions have near state-of-the-art performance, even with relatively few training examples. Predictions over small sets of scattering coefficients can reach DFT precision while being interpretable.

artificial intelligence, coefficient, machine learning, (16 more...)

arXiv.org Machine Learning

1805.00571

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

scikit-learn – Test Predictions Using Various Models

@machinelearnbot, Apr-30-2018, 06:12:49 GMT

Scikit-learn has evolved as a robust library for machine learning applications in Python with support for a wide range of supervised and unsupervised learning algorithms. This course begins by taking you through videos on linear models; with scikit-learn, you will take a machine learning approach to linear regression. As you progress, you will explore logistic regression. Then you will build models with distance metrics, including clustering. You will also look at cross-validation and post-model workflows, where you will see how to select a model that predicts well.

artificial intelligence, machine learning, test prediction, (4 more...)

@machinelearnbot

Country: North America > United States > Kansas (0.23)

Genre:

Instructional Material > Online (0.40)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.60)

OMG - Emotion Challenge Solution

Cui, Yuqi, Zhang, Xiao, Wang, Yang, Guo, Chenfeng, Wu, Dongrui

arXiv.org Machine Learning, Apr-30-2018

This short paper describes our solution to the 2018 IEEE World Congress on Computational Intelligence (IEEE WCCI 2018) One-Minute Gradual-Emotional Behavior Challenge, whose goal was to estimate continuous arousal and valence values from short videos. We designed four base regression models using visual and audio features, and then used a spectral approach to fuse them to obtain improved performance. The dataset was composed of 420 relatively long emotion videos with an average length of 1 minute, collected from a variety of YouTube channels. Videos were separated into clips based on utterances, and each utterance's valence and arousal levels were annotated by at least five independent subjects using the Amazon Mechanical Turk tool.

artificial intelligence, machine learning, utterance, (15 more...)

arXiv.org Machine Learning

1805.00348

Country:

North America (0.48)
Asia > China (0.30)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.39)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.30)

Steps of Modelling

@machinelearnbot, Apr-29-2018, 05:48:42 GMT

The data are usually recorded in rows and columns. A column represents a variable, whereas a row represents an observation, which is a set of p + 1 values for a single subject, i.e., one value for the response variable and one value for each of the p predictors. Each of the variables can be classified as either quantitative or qualitative. A technique used in cases where the response variable is binary is called logistic regression. In regression analysis, the predictor variables can be quantitative and/or qualitative. For the purpose of computations, however, the qualitative variables, if any, have to be coded into a set of indicator or dummy variables.
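
The indicator (dummy) coding of a qualitative predictor described above can be sketched as follows. This is a minimal illustration, not taken from the linked article: the function name `dummy_code` and the choice to drop the first category (in sorted order) as the baseline are assumptions made for the example.

```python
def dummy_code(values, drop_first=True):
    """Code one qualitative variable into 0/1 indicator columns.

    Hypothetical helper for illustration. Categories are sorted, and with
    drop_first=True the first category becomes the baseline (all zeros),
    so a variable with m categories yields m - 1 indicator columns.
    """
    categories = sorted(set(values))
    kept = categories[1:] if drop_first else categories
    # One row per observation, one 0/1 entry per kept category.
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


columns, coded = dummy_code(["red", "green", "red", "blue"])
print(columns)  # ['green', 'red']  ('blue' is the baseline)
print(coded)    # [[0, 1], [1, 0], [0, 1], [0, 0]]
```

Dropping one category avoids perfect collinearity with the intercept column in a linear regression; keeping all categories (`drop_first=False`) is the alternative when the model has no intercept.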

artificial intelligence, machine learning, predictor variable, (9 more...)

@machinelearnbot

Genre:

Research Report > New Finding (0.63)
Research Report > Experimental Study (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Simultaneous Parameter Learning and Bi-Clustering for Multi-Response Models

Yu, Ming, Ramamurthy, Karthikeyan Natesan, Thompson, Addie, Lozano, Aurélie

arXiv.org Machine Learning, Apr-29-2018

We consider multi-response and multitask regression models, where the parameter matrix to be estimated is expected to have an unknown grouping structure. The groupings can be along tasks, or features, or both, the last one indicating a bi-cluster or "checkerboard" structure. Discovering this grouping structure along with parameter inference makes sense in several applications, such as multi-response Genome-Wide Association Studies (GWAS). This additional structure not only can be leveraged for more accurate parameter estimation, but also provides valuable information on the underlying data mechanisms (e.g., relationships among genotypes and phenotypes in GWAS). In this paper, we propose two formulations to simultaneously learn the parameter matrix and its group structures, based on convex regularization penalties. We present optimization approaches to solve the resulting problems and provide numerical convergence guarantees. Our approaches are validated on extensive simulations and real datasets concerning phenotypes and genotypes of plant varieties.

artificial intelligence, formulation, machine learning, (18 more...)

arXiv.org Machine Learning

1804.10961

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Top 6 errors novice machine learning engineers make

#artificialintelligence, Apr-28-2018, 04:26:22 GMT

In machine learning, there are many ways to build a product or solution and each way assumes something different. Many times, it's not obvious how to navigate and identify which assumptions are reasonable. People new to machine learning make mistakes, which in hindsight will often feel silly. I've created a list of the top mistakes that novice machine learning engineers make. Hopefully, you can learn from these common errors and create more robust solutions that bring real value.

artificial intelligence, machine learning, novice machine, (11 more...)

#artificialintelligence

Genre: Research Report (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Novel Prediction Techniques Based on Clusterwise Linear Regression

Gitman, Igor, Chen, Jieshi, Lei, Eric, Dubrawski, Artur

arXiv.org Machine Learning, Apr-28-2018

In this paper we explore different regression models based on Clusterwise Linear Regression (CLR). CLR aims to find the partition of the data into $k$ clusters, such that linear regressions fitted to each of the clusters minimize overall mean squared error on the whole data. The main obstacle to using the fitted regression models for prediction on unseen test points is the absence of a reasonable way to obtain CLR cluster labels when the values of the target variable are unknown. In this paper we propose two novel approaches to solving this problem. The first approach, predictive CLR, builds a separate classification model to predict test CLR labels. The second approach, constrained CLR, utilizes a set of user-specified constraints that force certain points into the same clusters. Assuming the constraint values are known for the test points, they can be directly used to assign CLR labels. We evaluate these two approaches on three UCI ML datasets as well as on a large corpus of health insurance claims. We show that both of the proposed algorithms significantly improve over the known CLR-based regression methods. Moreover, predictive CLR consistently outperforms linear regression and random forest, and shows comparable performance to support vector regression on UCI ML datasets. The constrained CLR approach achieves the best performance on the health insurance dataset, while requiring only $\approx 20$ times the computational time of linear regression.
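
The basic CLR objective described in this abstract (partition the data into k clusters so that per-cluster linear fits minimize the overall squared error) can be sketched with a simple alternating heuristic. This is an illustrative sketch under stated assumptions, not the authors' predictive or constrained CLR: it handles a single predictor, takes the initial labels as an argument, and the names `fit_line` and `clusterwise_linear_regression` are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        # All x equal (or a single point): fall back to a constant fit.
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def clusterwise_linear_regression(points, init_labels, k, max_iter=50):
    """Alternate between per-cluster fits and residual-based reassignment.

    Assumed setup for illustration: points is a list of (x, y) pairs and
    init_labels holds a starting cluster index in range(k) for each point.
    Returns (labels, coefs, history), where history records the total
    squared error after each reassignment step.
    """
    labels = list(init_labels)
    coefs = [(0.0, 0.0)] * k
    history = []
    for _ in range(max_iter):
        # Step 1: refit a line to every non-empty cluster.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                coefs[c] = fit_line(members)
        # Step 2: move each point to the line with the smallest residual.
        new_labels = []
        sse = 0.0
        for x, y in points:
            errs = [(y - (a + b * x)) ** 2 for a, b in coefs]
            best = min(range(k), key=errs.__getitem__)
            new_labels.append(best)
            sse += errs[best]
        history.append(sse)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels, coefs, history
```

Note that the reassignment step needs the target value y of every point. That is why cluster labels are unavailable for unseen test points, which is the gap the two approaches in the paper are designed to close. Like k-means, this heuristic only reaches a local optimum that depends on the initial labels.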

artificial intelligence, machine learning, regression, (19 more...)

arXiv.org Machine Learning

1804.10742

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Banking & Finance > Insurance (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
