Cross Validation
Full cross-validation and generating learning curves for time-series models - KDnuggets
Time series analysis is needed in almost any quantitative field and in real-life systems that collect data over time, i.e., temporal datasets. Building predictive models on temporal datasets to project the future evolution of the system under consideration is usually called forecasting. Validating such models deviates from the standard holdout method of random, disjoint train, test, and validation splits used in supervised learning. This stems from the fact that time series are ordered, and that order induces statistical properties which must be retained. For this reason, cross-validation cannot be applied directly to time-series model building, and validation is often restricted to out-of-sample (OOS) validation, using the end of the temporal set as a single test set.
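The usual workaround is to keep the ordering: every validation fold lies strictly after its training data. Below is a minimal sketch using scikit-learn's TimeSeriesSplit with an expanding training window; the synthetic series and the ridge model are illustrative assumptions, not the article's own setup.

```python
# Ordered splits for time-series validation: each test fold follows its training fold.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
X = np.arange(200, dtype=float).reshape(-1, 1)          # time index as the only feature
y = 0.5 * X.ravel() + rng.normal(scale=5.0, size=200)   # synthetic trend plus noise

tscv = TimeSeriesSplit(n_splits=5)  # expanding training window, forward-looking test folds
for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    model = Ridge().fit(X[train_idx], y[train_idx])
    mse = mean_squared_error(y[test_idx], model.predict(X[test_idx]))
    print(f"fold {fold}: train={len(train_idx)}, test={len(test_idx)}, MSE={mse:.2f}")
```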
Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression
Stephenson, William T., Frangella, Zachary, Udell, Madeleine, Broderick, Tamara
Models like LASSO and ridge regression are extensively used in practice due to their interpretability, ease of use, and strong theoretical guarantees. Cross-validation (CV) is widely used for hyperparameter tuning in these models, but do practical optimization methods minimize the true out-of-sample loss? A recent line of research promises to show that the optimum of the CV loss matches the optimum of the out-of-sample loss (possibly after simple corrections). It remains to show how tractable it is to minimize the CV loss. In the present paper, we show that, in the case of ridge regression, the CV loss may fail to be quasiconvex and thus may have multiple local optima. We can guarantee that the CV loss is quasiconvex in at least one case: when the spectrum of the covariate matrix is nearly flat and the noise in the observed responses is not too high. More generally, we show that quasiconvexity status is independent of many properties of the observed data (response norm, covariate-matrix right singular vectors and singular-value scaling) and has a complex dependence on the few that remain. We empirically confirm our theory using simulated experiments.
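One practical consequence of the non-quasiconvexity result: since the CV loss over the ridge penalty may have multiple local optima, scanning a grid of penalties and taking the global minimum is safer than descending from an arbitrary starting point. A hedged sketch on synthetic data follows; the grid, fold count, and dataset are illustrative choices, not the paper's experimental setup.

```python
# Evaluate the 5-fold CV loss of ridge regression over a log-spaced grid of
# regularization strengths and take the global minimum on the grid.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=30, noise=10.0, random_state=0)

alphas = np.logspace(-4, 4, 50)
cv_losses = [
    -cross_val_score(Ridge(alpha=a), X, y, cv=5,
                     scoring="neg_mean_squared_error").mean()
    for a in alphas
]
best = alphas[int(np.argmin(cv_losses))]
print(f"alpha minimizing the 5-fold CV loss on this grid: {best:.4g}")
```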
Leave-One-Out Cross-Validation
Leave-one-out cross-validation (LOOCV) is k-fold cross-validation with k equal to n, the number of observations in the data. Every single point is used once in a validation set, so n models are created for n observations: each sample serves once as the test set while the remaining samples form the training set. The scikit-learn Python machine learning library provides an implementation of LOOCV via the LeaveOneOut class.
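A minimal sketch of LOOCV with scikit-learn's LeaveOneOut class; the toy data and the linear-regression estimator are illustrative assumptions.

```python
# Leave-one-out cross-validation: n folds, each holding out exactly one observation.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X = np.arange(20, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + np.random.default_rng(0).normal(size=20)

loo = LeaveOneOut()
scores = cross_val_score(LinearRegression(), X, y, cv=loo,
                         scoring="neg_mean_squared_error")
print(f"{len(scores)} folds, mean LOOCV MSE: {-scores.mean():.3f}")
```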
Scalable Cross Validation Losses for Gaussian Process Models
Jankowiak, Martin, Pleiss, Geoff
We introduce a simple and scalable method for training Gaussian process (GP) models that exploits cross-validation and nearest neighbor truncation. To accommodate binary and multi-class classification we leverage Pólya-Gamma auxiliary variables and variational inference. In an extensive empirical comparison with a number of alternative methods for scalable GP regression and classification, we find that our method offers fast training and excellent predictive performance. We argue that the good predictive performance can be traced to the non-parametric nature of the resulting predictive distributions as well as to the cross-validation loss, which provides robustness against model mis-specification.
20x times faster Grid Search Cross-Validation
To train a robust machine learning model, one must select the right algorithm together with the right combination of hyperparameters; the process of choosing the optimal set of hyperparameters is known as hyperparameter tuning. To improve the performance metric, candidate algorithms must be trained on the dataset under different combinations of their hyperparameters. Cross-validation can be used to evaluate these candidates and choose the best among them: it is a resampling technique for evaluating and selecting machine learning algorithms on a limited dataset.
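A minimal sketch of hyperparameter tuning with cross-validated grid search; the estimator, parameter grid, and dataset are illustrative assumptions, and the speed-up techniques the article discusses (e.g., successive halving or parallel backends) are not shown here.

```python
# Exhaustive grid search with 5-fold cross-validation for each candidate.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(SVC(), param_grid, cv=5, n_jobs=-1)
search.fit(X, y)

print("best params:", search.best_params_)
print(f"best mean CV accuracy: {search.best_score_:.3f}")
```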
MuyGPs: Scalable Gaussian Process Hyperparameter Estimation Using Local Cross-Validation
Muyskens, Amanda, Priest, Benjamin, Goumiri, Imène, Schneider, Michael
Gaussian processes (GPs) are non-linear probabilistic models popular in many applications. However, naïve GP realizations require quadratic memory to store the covariance matrix and cubic computation to perform inference or evaluate the likelihood function. These bottlenecks have driven much investment in the development of approximate GP alternatives that scale to the large data sizes common in modern data-driven applications. We present in this manuscript MuyGPs, a novel efficient GP hyperparameter estimation method. MuyGPs builds upon prior methods that take advantage of the nearest neighbors structure of the data, and uses leave-one-out cross-validation to optimize covariance (kernel) hyperparameters without realizing a possibly expensive likelihood. We describe our model and methods in detail, and compare our implementations against the state-of-the-art competitors in a benchmark spatial statistics problem. We show that our method outperforms all known competitors both in terms of time-to-solution and the root mean squared error of the predictions.
Cross-validation: what does it estimate and how well does it do it?
Bates, Stephen, Hastie, Trevor, Tibshirani, Robert
When deploying a predictive model, it is important to understand its prediction accuracy on future test points, so both good point estimates and accurate confidence intervals for prediction error are essential. Cross-validation (CV) is a widely-used approach for these two tasks, but in spite of its seeming simplicity, its operating properties remain opaque. Considering first estimation, it turns out to be challenging to precisely state the estimand corresponding to the cross-validation point estimate. In this work, we show that the estimand of CV is not the accuracy of the model fit on the data at hand, but is instead the average accuracy over many hypothetical data sets. Specifically, we show that the CV estimate of error has larger mean squared error (MSE) when estimating the prediction error of the final model than when estimating the average prediction error of models across many unseen data sets for the special case of linear regression. Turning to confidence intervals for prediction error, we show that naïve intervals based on CV can fail badly, giving coverage far below the nominal level; we provide a simple example in Section 1.1. The source of this behavior is the estimation of the variance used to compute the width of the interval: it does not account for the correlation between the error estimates in different folds, which arises because each data point is used for both training and testing. As a result, the estimate of variance is too small and the intervals are too narrow. To address this issue, we develop a modification of cross-validation, nested cross-validation (NCV), that achieves coverage near the nominal level, even in challenging cases where the usual cross-validation intervals have miscoverage rates two to three times larger than the nominal rate.
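For concreteness, here is a hedged sketch of the naïve CV interval the abstract warns about: the fold errors are treated as if they were independent, which is exactly the assumption the paper shows is violated. The data, model, fold count, and 95% level are illustrative; this is not the paper's nested cross-validation procedure.

```python
# Naive normal-approximation interval for prediction error from K-fold CV errors.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=100, n_features=10, noise=5.0, random_state=0)

fold_errors = -cross_val_score(LinearRegression(), X, y, cv=10,
                               scoring="neg_mean_squared_error")
mean_err = fold_errors.mean()
# Naive standard error: ignores correlation between folds (each point is used for
# both training and testing), so the resulting interval tends to be too narrow.
se = fold_errors.std(ddof=1) / np.sqrt(len(fold_errors))
print(f"naive 95% interval for prediction error: "
      f"[{mean_err - 1.96 * se:.2f}, {mean_err + 1.96 * se:.2f}]")
```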
Detecting Label Noise via Leave-One-Out Cross Validation
Tang, Yu-Hang, Zhu, Yuanran, de Jong, Wibe A.
We present a simple algorithm for identifying and correcting real-valued noisy labels from a mixture of clean and corrupted samples using Gaussian process regression. A heteroscedastic noise model is employed, in which additive Gaussian noise terms with independent variances are associated with each and all of the observed labels. Thus, the method effectively applies a sample-specific Tikhonov regularization term, generalizing the uniform regularization prevalent in standard Gaussian process regression. Optimizing the noise model using maximum likelihood estimation leads to the containment of the GPR model's predictive error by the posterior standard deviation in leave-one-out cross-validation. A multiplicative update scheme is proposed for solving the maximum likelihood estimation problem under non-negative constraints. While we provide a proof of monotonic convergence for certain special cases, the multiplicative scheme has empirically demonstrated monotonic convergence behavior in virtually all our numerical experiments. We show that the presented method can pinpoint corrupted samples and lead to better regression models when trained on synthetic and real-world scientific data sets.
Different Data Splitting Cross-Validation Strategies with Python
In this article, we will cover cross-validation methods for splitting the dataset so that we get reliable estimates of prediction performance. We have all seen data being split into a training set and a testing set before feeding a machine learning algorithm, but are those two sets enough to build a production model? In my view, we should also include a validation set before we predict on the test set. This matters because, if the model overfits, we can tune the hyperparameters against the validation set and only then fix the chosen parameters before evaluating on the test set; a minimal three-way split is sketched below.
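A minimal sketch of a three-way train/validation/test split using two calls to scikit-learn's train_test_split; the 60/20/20 proportions and the iris dataset are illustrative assumptions.

```python
# Carve off the held-out test set first, then split the remainder into train/validation.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0, stratify=y)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=0, stratify=y_trainval)

print(len(X_train), len(X_val), len(X_test))  # roughly 60% / 20% / 20%
```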
Statistical learning and cross-validation for point processes
Cronie, Ottmar, Moradi, Mehdi, Biscio, Christophe A. N.
This paper presents the first general (supervised) statistical learning framework for point processes in general spaces. Our approach is based on the combination of two new concepts, which we define in the paper: i) bivariate innovations, which are measures of discrepancy/prediction-accuracy between two point processes, and ii) point process cross-validation (CV), which we here define through point process thinning. The general idea is to carry out the fitting by predicting CV-generated validation sets using the corresponding training sets; the prediction error, which we minimise, is measured by means of bivariate innovations. Having established various theoretical properties of our bivariate innovations, we study in detail the case where the CV procedure is obtained through independent thinning and we apply our statistical learning methodology to three typical spatial statistical settings, namely parametric intensity estimation, non-parametric intensity estimation and Papangelou conditional intensity fitting. Aside from deriving theoretical properties related to these cases, in each of them we numerically show that our statistical learning approach outperforms the state of the art in terms of mean (integrated) squared error.