AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.51)

#artificialintelligence Aug-3-2020, 03:55:23 GMT

12+ BEST Machine Learning with Python Masterclass [2020] [UPDATE] - Gift Course

Do you want to become an expert Python developer? Get started with the Python Masterclass, which consists of the top 12 online tutorials to make your learning easy! This is an ultimate Python masterclass: get 12 exclusive Machine Learning courses. This Machine Learning masterclass covers all essential concepts of Python and Machine Learning in addition to over 100 practical projects. Python was developed because the creator was frustrated by not being able to find exactly what he wanted from a programming language.

artificial intelligence, machine learning, python, (13 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.73)

Sukkerd, Roykrong, Simmons, Reid, Garlan, David

Tradeoff-Focused Contrastive Explanation for MDP Planning

arXiv.org Artificial Intelligence Aug-2-2020

End-users' trust in automated agents is important as automated decision-making and planning is increasingly used in many aspects of people's lives. In real-world applications of planning, multiple optimization objectives are often involved. Thus, planning agents' decisions can involve complex tradeoffs among competing objectives. It can be difficult for the end-users to understand why an agent decides on a particular planning solution on the basis of its objective values. As a result, the users may not know whether the agent is making the right decisions, and may lack trust in it. In this work, we contribute an approach, based on contrastive explanation, that enables a multi-objective MDP planning agent to explain its decisions in a way that communicates its tradeoff rationale in terms of the domain-level concepts. We conduct a human subjects experiment to evaluate the effectiveness of our explanation approach in a mobile robot navigation domain. The results show that our approach significantly improves the users' understanding, and confidence in their understanding, of the tradeoff rationale of the planning agent.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2004.1296

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > Canada (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
(2 more...)

Botsas, Themistoklis, Mason, Lachlan R., Pan, Indranil

Rule-based Bayesian regression

arXiv.org Machine Learning Aug-2-2020

We introduce a novel rule-based approach for handling regression problems. The new methodology carries elements from two frameworks: (i) it provides information about the uncertainty of the parameters of interest using Bayesian inference, and (ii) it allows the incorporation of expert knowledge through rule-based systems. The blending of those two different frameworks can be particularly beneficial for various domains (e.g. engineering), where, even though the significance of uncertainty quantification motivates a Bayesian approach, there is no simple way to incorporate researcher intuition into the model. We validate our models by applying them to synthetic applications: a simple linear regression problem and two more complex structures based on partial differential equations. Finally, we review the advantages of our methodology, which include the simplicity of the implementation, the uncertainty reduction due to the added information and, on some occasions, the derivation of better point predictions, and we address limitations, mainly from the computational complexity perspective, such as the difficulty in choosing an appropriate algorithm and the added computational burden.

artificial intelligence, machine learning, regression, (20 more...)

2008.00422

Country: North America > United States > New York (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.91)

#artificialintelligence Aug-1-2020, 05:20:22 GMT

R squared Does Not Measure Predictive Capacity or Statistical Adequacy - KDnuggets

The R-squared goodness-of-fit measure is one of the most widely available statistics accompanying the output of regression analysis in statistical software. Perhaps partially due to its widespread availability, it is also one of the most often misunderstood ones. In a regression with a single independent variable, R² is calculated as the ratio between the variation explained by the model and the total observed variation. It is often called the coefficient of determination and can be interpreted as the proportion of variation explained by the posed predictor. In such a case, it is equivalent to the square of the correlation coefficient of the observed and fitted values of the variable.
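The two descriptions of R-squared given above (explained variation over total variation, and the squared correlation between observed and fitted values) can be checked numerically. The following is a minimal sketch on synthetic data, assuming an ordinary least squares fit with an intercept and one predictor; the data, the seed, and the function name `r_squared` are illustrative assumptions and do not come from the article.

```python
import numpy as np

def r_squared(y, y_hat):
    """R^2 as explained variation divided by total variation.

    For an OLS fit with an intercept this equals 1 - SS_res / SS_tot.
    """
    ss_explained = np.sum((y_hat - np.mean(y)) ** 2)
    ss_total = np.sum((y - np.mean(y)) ** 2)
    return ss_explained / ss_total

# Synthetic data (assumed for illustration): a noisy straight line.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 + 3.0 * x + rng.normal(scale=1.5, size=200)

# OLS fit with an intercept; np.polyfit returns the highest degree first.
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

r2_ratio = r_squared(y, y_hat)
r2_corr = np.corrcoef(y, y_hat)[0, 1] ** 2
print(r2_ratio, r2_corr)
```

The two printed values should agree up to floating-point error. The agreement relies on the fit being least squares with an intercept; for other fitting procedures the two quantities can differ.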

artificial intelligence, coefficient, machine learning, (14 more...)

Country:

North America > United States > Virginia (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)

Genre: Research Report > Experimental Study (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.53)

Burt, David R., Rasmussen, Carl Edward, van der Wilk, Mark

Convergence of Sparse Variational Inference in Gaussian Processes Regression

arXiv.org Machine Learning Aug-1-2020

Gaussian processes are distributions over functions that are versatile and mathematically convenient priors in Bayesian modelling. However, their use is often impeded for data with large numbers of observations, $N$, due to the cubic (in $N$) cost of matrix operations used in exact inference. Many solutions have been proposed that rely on $M \ll N$ inducing variables to form an approximation at a cost of $\mathcal{O}(NM^2)$. While the computational cost appears linear in $N$, the true complexity depends on how $M$ must scale with $N$ to ensure a certain quality of the approximation. In this work, we investigate upper and lower bounds on how $M$ needs to grow with $N$ to ensure high quality approximations. We show that we can make the KL-divergence between the approximate model and the exact posterior arbitrarily small for a Gaussian-noise regression model with $M\ll N$. Specifically, for the popular squared exponential kernel and $D$-dimensional Gaussian distributed covariates, $M=\mathcal{O}((\log N)^D)$ suffice and a method with an overall computational cost of $\mathcal{O}(N(\log N)^{2D}(\log\log N)^2)$ can be used to perform inference.
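To make the $\mathcal{O}(NM^2)$ cost and the role of $M$ concrete, here is a minimal sketch of a Nyström-style low-rank approximation built from $M$ inducing inputs. It is not the authors' method or code. It assumes a unit-variance squared exponential kernel, one-dimensional Gaussian covariates, inducing inputs chosen as a random subset of the data, and it uses the trace of the residual $K_{ff} - K_{fu}K_{uu}^{-1}K_{uf}$ as a rough proxy for approximation quality. The abstract does not state which quantity the paper bounds, so that proxy is an assumption, as are all function names.

```python
import numpy as np

def se_kernel(a, b, lengthscale=1.0):
    """Unit-variance squared exponential kernel between 1-D arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def nystrom_trace_gap(x, z, jitter=1e-6):
    """tr(K_ff - K_fu K_uu^{-1} K_uf) without forming any N x N matrix."""
    kuu = se_kernel(z, z) + jitter * np.eye(len(z))  # M x M
    kuf = se_kernel(z, x)                            # M x N
    chol = np.linalg.cholesky(kuu)                   # O(M^3)
    a = np.linalg.solve(chol, kuf)                   # O(N M^2)
    q_diag = np.sum(a**2, axis=0)                    # diagonal of the low-rank part
    k_diag = np.ones(len(x))                         # unit-variance kernel diagonal
    return float(np.sum(k_diag - q_diag))

rng = np.random.default_rng(0)
N = 1000
x = rng.normal(size=N)       # Gaussian-distributed covariates, D = 1
order = rng.permutation(N)   # nested random subsets serve as inducing inputs

inducing_counts = [5, 10, 20, 40]
gaps = [nystrom_trace_gap(x, x[order[:m]]) for m in inducing_counts]
for m, gap in zip(inducing_counts, gaps):
    print(m, gap)
```

In this toy setting the residual trace shrinks quickly as $M$ grows while $M$ stays far below $N$, which is the qualitative behaviour the abstract describes. The sketch says nothing about the specific rates proved in the paper.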

approximation, artificial intelligence, machine learning, (18 more...)

2008.00323

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Cabassi, Alessandra, Seyres, Denis, Frontini, Mattia, Kirk, Paul D. W.

Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome

arXiv.org Machine Learning Aug-1-2020

Summary: Building classification models that predict a binary class label on the basis of high dimensional multi-omics datasets poses several challenges, due to the typically widely differing characteristics of the data layers in terms of number of predictors, type of data, and levels of noise. Previous research has shown that applying classical logistic regression with elastic-net penalty to these datasets can lead to poor results (Liu et al., 2018). We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately and a predictive model is then built using the variables selected in the first step. Here, our approach is compared to other methods that have been developed for the same purpose, and we adapt existing software for multi-omic linear regression (Zhao and Zucknick, 2020) to the logistic regression setting. Extensive simulation studies show that our approach should be preferred if the goal is to select as many relevant predictors as possible, as well as achieving prediction performances comparable to those of the best competitors. Our motivating example is a cardiometabolic syndrome dataset comprising eight 'omic data types for 2 extreme phenotype groups (10 obese and 10 lipodystrophy individuals) and 185 blood donors. Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.

artificial intelligence, machine learning, modeling & simulation, (17 more...)

2008.00235

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Europe > Austria > Vienna (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area > Hematology (0.92)
(3 more...)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligence Jul-31-2020, 07:12:04 GMT

Explainable-AI: Where Supervised Learning Can Falter

Disclaimer: I'll be talking mainly about logistic regression and basic feed-forward neural networks, so it's helpful to have programmed with those two models before reading this piece. OK -- before statisticians and ML folks come running after me after reading the title, I'm not talking about linear regression, for example. Yes, in linear regression, you can use the R-squared (or adjusted R-squared) statistic to talk about explained variance, and since linear regression only involves addition between independent variables (or predictors), the models are pretty interpretable. If you were doing a linear regression to predict, say, the price of a car, Car_Price, based on the number of seats, mileage, maximum speed, and battery life, your linear model could be, say, Car_Price = c1*Seats + c2*Mileage + c3*Speed + c4*Battery_Power; the fact that variables are only added makes it pretty interpretable. But when it comes to more complex prediction models like logistic regression and neural networks, everything about the predictors (called "features" in ML) becomes more confusing.
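The additive car-price model described above can be fitted and read off directly. The following is a minimal sketch on made-up data: the coefficient values, the value ranges, and the noise level are all assumptions chosen for illustration, and the model has no intercept because the formula in the text has none.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Made-up predictors (ranges are illustrative assumptions).
seats = rng.integers(2, 8, size=n).astype(float)
mileage = rng.uniform(10, 40, size=n)
speed = rng.uniform(120, 250, size=n)
battery_power = rng.uniform(30, 100, size=n)

# Hypothetical "true" coefficients c1..c4 used to simulate prices.
true_c = np.array([1500.0, -200.0, 80.0, 150.0])
X = np.column_stack([seats, mileage, speed, battery_power])
car_price = X @ true_c + rng.normal(scale=500.0, size=n)

# Least squares fit of Car_Price = c1*Seats + c2*Mileage + c3*Speed + c4*Battery_Power.
c_hat, *_ = np.linalg.lstsq(X, car_price, rcond=None)

for name, c in zip(["Seats", "Mileage", "Speed", "Battery_Power"], c_hat):
    # Each coefficient is the predicted price change per unit of that predictor,
    # holding the other predictors fixed.
    print(name, round(float(c), 2))
```

Because the predictors only enter additively, each fitted coefficient has a direct reading as a per-unit effect on the predicted price, which is the interpretability the passage refers to.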

artificial intelligence, machine learning, neural network, (14 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Machine Learning Jul-31-2020

A Multi-Variate Triple-Regression Forecasting Algorithm for Long-Term Customized Allergy Season Prediction

Wu, Xiaoyu, Bai, Zeyu, Liang, Youzhi

In this paper, we propose a novel multi-variate algorithm using a triple-regression methodology to predict the airborne-pollen allergy season that can be customized for each patient in the long term. To improve the prediction accuracy, we first perform pre-processing to integrate the historical data of pollen concentration and various inferential signals from other covariates such as the meteorological data. We then propose a novel algorithm which encompasses three-stage regressions: in Stage 1, a regression model to predict the start/end date of an airborne-pollen allergy season is trained from a feature matrix extracted from 12 time series of the covariates with a rolling window; in Stage 2, a regression model to predict the corresponding uncertainty is trained based on the feature matrix and the prediction result from Stage 1; in Stage 3, a weighted linear regression model is built upon prediction results from Stage 1 and 2. It is observed and proved that Stage 3 contributes to the improved forecasting accuracy and the reduced uncertainty of the multi-variate triple-regression algorithm. Based on different allergy sensitivity levels, the triggering concentration of the pollen, which defines the allergy season, can be customized individually. In our backtesting, a mean absolute error (MAE) of 4.7 days was achieved using the algorithm. We conclude that this algorithm could be applicable in both generic and long-term forecasting problems.

artificial intelligence, machine learning, prediction, (15 more...)

2005.04557

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
(3 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Xu, Ning, Fisher, Timothy C. G., Hong, Jian

Rademacher upper bounds for cross-validation errors with an application to the lasso

arXiv.org Machine Learning Jul-30-2020

We establish a general upper bound for $K$-fold cross-validation ($K$-CV) errors that can be adapted to many $K$-CV-based estimators and learning algorithms. Based on Rademacher complexity of the model and the Orlicz-$\Psi_{\nu}$ norm of the error process, the CV error upper bound applies to both light-tail and heavy-tail error distributions. We also extend the CV error upper bound to $\beta$-mixing data using the technique of independent blocking. We provide a Python package (\texttt{CVbound}, \url{https://github.com/isaac2math}) for computing the CV error upper bound in $K$-CV-based algorithms. Using the lasso as an example, we demonstrate in simulations that the upper bounds are tight and stable across different parameter settings and random seeds. As well as accurately bounding the CV errors for the lasso, the minimizer of the new upper bounds can be used as a criterion for variable selection. Compared with the CV-error minimizer, simulations show that tuning the lasso penalty parameter according to the minimizer of the upper bound yields a more sparse and more stable model that retains all of the relevant variables.
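For readers unfamiliar with the baseline the abstract compares against, the sketch below shows plain $K$-fold cross-validation used to tune the lasso penalty (the "CV-error minimizer"). It does not compute the paper's Rademacher upper bound and does not use the \texttt{CVbound} package. The coordinate descent solver, the synthetic data, the penalty grid, and every function name here are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = np.sum(X**2, axis=0) / n
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]   # partial residual without coordinate j
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]   # restore the full residual
    return beta

def kfold_cv_error(X, y, lam, k=5, seed=0):
    """Average held-out mean squared error over k folds."""
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    errors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        beta = lasso_cd(X[train_idx], y[train_idx], lam)
        errors.append(np.mean((y[test_idx] - X[test_idx] @ beta) ** 2))
    return float(np.mean(errors))

# Synthetic sparse regression problem: 3 relevant variables out of 20.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.normal(size=(n, p))
true_beta = np.zeros(p)
true_beta[:3] = [3.0, -2.0, 1.5]
y = X @ true_beta + rng.normal(size=n)

lambdas = np.logspace(-3, 0, 10)
cv_errors = np.array([kfold_cv_error(X, y, lam) for lam in lambdas])
best_lam = lambdas[int(np.argmin(cv_errors))]
beta_best = lasso_cd(X, y, best_lam)
print(best_lam, np.flatnonzero(beta_best))
```

The same folds are reused for every penalty value so that the CV errors are comparable across the grid. The data are generated with zero mean and no intercept, so the sketch skips centering; real data would normally need it.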

cv error, imsart-ao ver, tex date, (14 more...)

2007.15598

Country:

North America > United States > New York (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.62)