AITopics | Regression


Regression


Causality-based Explanation of Classification Outcomes

Bertossi, Leopoldo, Li, Jordan, Schleich, Maximilian, Suciu, Dan, Vagena, Zografoula

arXiv.org Artificial Intelligence, Mar-15-2020

Machine-learning (ML) models are increasingly used today in making decisions that affect real people's lives, and, because of that, there is a huge need to ensure that the models and their decisions are interpretable by their human users. Motivated by this need, there has been a lot of interest recently in the ML community in studying interpretable models [18]. There is currently no consensus on what interpretability means, and no benchmarks for evaluating interpretability [5, 10]. The only consensus is that simpler models such as linear regression or decision trees are considered more interpretable than complex models like, say, deep neural nets. However, two general principles for approaching interpretability have emerged in the literature that are relevant to our paper.

explanation, resp-score, shap-score, (15 more...)

arXiv.org Artificial Intelligence

2003.06868

Country:

Europe (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
(5 more...)

Genre: Research Report > New Finding (0.68)

Industry: Banking & Finance > Credit (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)


Data-driven surrogate modelling and benchmarking for process equipment

Gonçalves, Gabriel F. N., Batchvarov, Assen, Liu, Yuyi, Liu, Yuxin, Mason, Lachlan, Pan, Indranil, Matar, Omar K.

arXiv.org Machine Learning, Mar-13-2020

A suite of computational fluid dynamics (CFD) simulations geared towards chemical process equipment modelling has been developed and validated with experimental results from the literature. Various regression-based active learning strategies are explored with these CFD simulators in the loop under the constraints of a limited function evaluation budget. Specifically, five different sampling strategies and five regression techniques are compared, considering a set of three test cases of industrial significance and varying complexity. Gaussian process regression was observed to have a consistently good performance for these applications. The present quantitative study outlines the pros and cons of the different available techniques and highlights the best practices for their adoption. The test cases and tools are available with an open-source license, to ensure reproducibility and engage the wider research community in contributing to both the CFD models and developing and benchmarking new improved algorithms tailored to this field.

neural network, simulation, upstream oil & gas, (21 more...)

arXiv.org Machine Learning

2003.07701

Country: Europe > United Kingdom (0.28)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)


A Time Series Approach To Player Churn and Conversion in Videogames

del Río, Ana Fernández, Guitart, Anna, Periáñez, África

arXiv.org Machine Learning, Mar-13-2020

Players of a free-to-play game are divided into three main groups: non-paying active users, paying active users and inactive users. A State Space time series approach is then used to model the daily conversion rates between the different groups, i.e., the probability of transitioning from one group to another. This allows not only for predictions on how these rates are to evolve, but also for a deeper understanding of the impact that in-game planning and calendar effects have. It is also used in this work for the detection of marketing and promotion campaigns about which no information is available. In particular, two different State Space formulations are considered and compared: an Autoregressive Integrated Moving Average process and an Unobserved Components approach, in both cases with a linear regression on explanatory variables. Both yield very close estimates for covariate parameters, producing forecasts with similar performance for most transition rates. While the Unobserved Components approach is more robust and needs less human intervention with regard to model definition, it produces significantly worse forecasts for non-paying user abandonment probability. More critically, it also fails to detect a plausible marketing and promotion campaign scenario.
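As a rough illustration of the "daily conversion rates" mentioned above, the sketch below computes empirical one-day transition probabilities between the three player groups from raw counts. It does not reproduce the State Space models of the paper; the state names, the `transition_rates` function and the example counts are all hypothetical and chosen only for illustration.

```python
# Hypothetical labels for the three player groups named in the abstract.
STATES = ("non_paying", "paying", "inactive")


def transition_rates(counts):
    """Turn one day's transition counts into empirical probabilities.

    counts[src][dst] is the number of players who were in group `src`
    yesterday and are in group `dst` today (made-up input format).
    """
    rates = {}
    for src in STATES:
        total = sum(counts[src][dst] for dst in STATES)
        # A group with no players yields all-zero rates instead of dividing by zero.
        rates[src] = {
            dst: (counts[src][dst] / total if total else 0.0) for dst in STATES
        }
    return rates


# Made-up counts for a single day.
example_counts = {
    "non_paying": {"non_paying": 80, "paying": 5, "inactive": 15},
    "paying": {"non_paying": 10, "paying": 85, "inactive": 5},
    "inactive": {"non_paying": 2, "paying": 0, "inactive": 98},
}
example_rates = transition_rates(example_counts)
```

Repeating this for each day would give one time series per transition, which is the kind of quantity a time series model could then be fitted to.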

intervention, player churn and conversion, time series approach, (11 more...)

arXiv.org Machine Learning

2003.10287

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.08)
(9 more...)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)


Experimental Comparison of Semi-parametric, Parametric, and Machine Learning Models for Time-to-Event Analysis Through the Concordance Index

Fernandez, Camila, Chen, Chung Shue, Gaillard, Pierre, Silva, Alonso

arXiv.org Machine Learning, Mar-13-2020

In this paper, we make an experimental comparison of semi-parametric (Cox proportional hazards model, Aalen's additive regression model), parametric (Weibull AFT model), and machine learning models (Random Survival Forest, Gradient Boosting with Cox Proportional Hazards Loss, DeepSurv) through the concordance index on two different datasets (PBC and GBCSG2). We present two comparisons: one with the default hyper-parameters of these models and one with the best hyper-parameters found by randomized search.
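For readers unfamiliar with the metric, here is a minimal sketch of Harrell's concordance index for right-censored data, written in plain Python. It is a simplified illustration rather than the evaluation code used in the paper: pairs with tied times are skipped, ties in predicted risk count as half, and the function name and example values are assumptions.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs that the risk scores order correctly.

    times: observed times; events: 1 if the event occurred, 0 if censored;
    risk_scores: higher score means the model expects an earlier event.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times and pairs whose shorter time is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up example: the third subject is censored.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.7, 0.8, 0.1],
)
# 4 of the 5 comparable pairs are concordant, so c_index is 0.8.
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking of the comparable pairs.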

concordance index, dataset, survival time, (12 more...)

arXiv.org Machine Learning

2003.0882

Country:

Europe > France (0.05)
North America > United States > New York (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)


A brief introduction to Logistic Regression techsocialnetwork

#artificialintelligence, Mar-11-2020, 11:55:06 GMT

In our previous chapters, we mainly discussed the Linear Regression model, where the target variable to be predicted is continuous in nature and there is a linear relationship between the independent and target variables. But how do we predict a discrete variable based upon predictors that are linearly related to the target? In this case, Logistic Regression comes to the rescue. In this article, we will mainly focus on this predictive model and get to know its inner workings. So, what is Logistic Regression?
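To make the idea concrete before the definition, here is a minimal sketch of how a logistic regression model turns a linear combination of predictors into a probability and then into a discrete label. The weights, bias and feature values are made up for illustration and are not fitted to any data; the function names are our own.

```python
import math


def sigmoid(z):
    """Map any real number to (0, 1), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, features):
    """Probability of class 1: sigmoid of the linear score bias + w . x."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(weights, bias, features, threshold=0.5):
    """Discrete prediction: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0


# Made-up parameters: linear score = 0.5 + 1.0 * 1.5 - 2.0 * 1.0 = 0.0
example_probability = predict_proba([1.0, -2.0], 0.5, [1.5, 1.0])
```

The linear part is the same as in linear regression; only the sigmoid on top changes the output from an unbounded number to a probability.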

logistic regression, logistic regression techsocialnetwork, probability, (9 more...)

#artificialintelligence

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)


Top 5 Data Science Algorithms that you must know!

#artificialintelligence, Mar-11-2020, 11:35:00 GMT

Right now, we utilize different data science algorithms to solve the task at hand. There are many algorithms out there, so it can be quite overwhelming for beginners. Today, we will quickly present the top 5 mainstream Machine Learning algorithms so you can get comfortable with the exciting world of Data Science! Linear Regression is probably the best-known ML algorithm. It finds the line that best fits scattered data points on a graph.
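As a quick illustration of "the line that best fits", the sketch below computes the ordinary least squares slope and intercept for a handful of points in plain Python. The function name and the sample points are made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x and co-variation of x with y around their means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points lying exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

With noisy points the same formulas give the line that minimizes the sum of squared vertical distances to the data.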

algorithm, hyperplane, recurrent neural network, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.36)


Classical Statistics and Statistical Learning in Imaging Neuroscience

#artificialintelligence, Mar-10-2020, 04:29:03 GMT

Single subject prediction of brain disorders in neuroimaging: promises and pitfalls.

algorithm, hypothesis, inference, (16 more...)

#artificialintelligence

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(12 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material (1.00)
Overview (0.67)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(4 more...)


Multivariate Functional Regression via Nested Reduced-Rank Regularization

Liu, Xiaokang, Ma, Shujie, Chen, Kun

arXiv.org Machine Learning, Mar-10-2020

We propose a nested reduced-rank regression (NRRR) approach for fitting regression models with multivariate functional responses and predictors, to achieve tailored dimension reduction and facilitate interpretation/visualization of the resulting functional model. Our approach is based on a two-level low-rank structure imposed on the functional regression surfaces. A global low-rank structure identifies a small set of latent principal functional responses and predictors that drives the underlying regression association. A local low-rank structure then controls the complexity and smoothness of the association between the principal functional responses and predictors. Through a basis expansion approach, the functional problem boils down to an interesting integrated matrix approximation task, where the blocks or submatrices of an integrated low-rank matrix share some common row space and/or column space. An iterative algorithm with a convergence guarantee is developed. We establish the consistency of NRRR and also show through non-asymptotic analysis that it can achieve at least a comparable error rate to that of the reduced-rank regression. Simulation studies demonstrate the effectiveness of NRRR. We apply NRRR to an electricity demand problem, to relate the trajectories of the daily electricity consumption with those of the daily temperatures.

estimation, matrix, predictor, (14 more...)

arXiv.org Machine Learning

2003.04786

Country:

Oceania > Australia > South Australia (0.04)
North America > United States > New York (0.04)
North America > United States > Connecticut (0.04)
North America > United States > California > Riverside County > Riverside (0.04)

Genre: Research Report (0.82)

Industry: Energy > Power Industry (0.69)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)


Short-Term Forecasting of CO2 Emission Intensity in Power Grids by Machine Learning

Leerbeck, Kenneth, Bacher, Peder, Junker, Rune, Goranović, Goran, Corradi, Olivier, Ebrahimy, Razgar, Tveit, Anna, Madsen, Henrik

arXiv.org Machine Learning, Mar-10-2020

A machine learning algorithm is developed to forecast the CO2 emission intensities in electrical power grids in the Danish bidding zone DK2, distinguishing between average and marginal emissions. The analysis was done on a data set comprising a large number (473) of explanatory variables, such as power production, demand, import, and weather conditions, collected from selected neighboring zones. The number was reduced to fewer than 50 using both LASSO (a penalized linear regression analysis) and a forward feature selection algorithm. Three linear regression models that capture different aspects of the data (non-linearities, coupling of variables, etc.) were created and combined into a final model using a Softmax weighted average. Cross-validation is performed for debiasing, and an autoregressive integrated moving average (ARIMA) model is implemented to correct the residuals, making the final model the variant with exogenous inputs (ARIMAX). The forecasts with the corresponding uncertainties are given for two time horizons, below and above six hours. Marginal emissions turned out to be independent of any conditions in the DK2 zone, suggesting that the marginal generators are located in the neighbouring zones. The developed methodology can be applied to any bidding zone in the European electricity network without requiring detailed knowledge about the zone.
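The abstract does not say how the Softmax weights are obtained, so the following is only a sketch under one plausible assumption: each model's weight comes from a softmax over its negative recent error, so that lower error means higher weight. The function names, the `temperature` parameter and the numbers are hypothetical.

```python
import math


def softmax(scores):
    """Softmax with the maximum subtracted first for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_forecasts(forecasts, errors, temperature=1.0):
    """Blend per-model forecasts using softmax weights.

    Assumption (not from the paper): weights are a softmax over the
    negative recent error of each model, scaled by `temperature`.
    """
    weights = softmax([-e / temperature for e in errors])
    combined = sum(w * f for w, f in zip(weights, forecasts))
    return combined, weights


# Made-up forecasts from three models and their recent errors.
combined, weights = combine_forecasts(
    forecasts=[100.0, 200.0, 300.0],
    errors=[1.0, 1.0, 1.0],
)
```

With equal errors the weights are equal and the result is the plain average; as one model's error shrinks relative to the others, the combination moves towards that model's forecast.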

co2, emission, forecast, (16 more...)

arXiv.org Machine Learning

2003.0574

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.26)
Europe > Sweden (0.05)
Europe > Germany (0.04)
(7 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Banking & Finance (1.00)
Energy > Renewable > Wind (0.69)
Energy > Renewable > Solar (0.68)
Energy > Power Industry > Utilities (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)


Auditing ML Models for Individual Bias and Unfairness

Xue, Songkai, Yurochkin, Mikhail, Sun, Yuekai

arXiv.org Machine Learning, Mar-10-2020

We consider the task of auditing ML models for individual bias/unfairness. We formalize the task in an optimization problem and develop a suite of inferential tools for the optimal value. Our tools permit us to obtain asymptotic confidence intervals and hypothesis tests that cover the target/control the Type I error rate exactly. To demonstrate the utility of our tools, we use them to reveal the gender and racial biases in Northpointe's COMPAS recidivism prediction instrument.

auditor, fairness, ml model, (12 more...)

arXiv.org Machine Learning

2003.05048

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry: Law (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
