 Cross Validation


Fast calculation of Gaussian Process multiple-fold cross-validation residuals and their covariances

arXiv.org Machine Learning

We generalize fast Gaussian process leave-one-out formulae to multiple-fold cross-validation, highlighting in broad settings the covariance structure of cross-validation residuals. The employed approach, which relies on block matrix inversion via Schur complements, is applied to both Simple and Universal Kriging frameworks. We illustrate how the resulting covariances affect model diagnostics and how to properly transform residuals in the first place. Beyond that, we examine how accounting for dependency between such residuals affects cross-validation-based estimation of the scale parameter. It is found in two distinct cases, namely in scale estimation and in broader covariance parameter estimation via pseudo-likelihood, that correcting for covariances between cross-validation residuals leads back to maximum likelihood estimation or to an original variation thereof. The proposed fast calculation of Gaussian Process multiple-fold cross-validation residuals is implemented and benchmarked against a naive implementation, all in the R language. Numerical experiments highlight the accuracy of our approach as well as the substantial speed-ups that it enables. It is noticeable, however, as supported by a discussion of the main drivers of computational costs and by a dedicated numerical benchmark, that speed-ups steeply decline as the number of folds (say, all sharing the same size) decreases. Overall, our results enable fast multiple-fold cross-validation, have direct consequences for GP model diagnostics, and pave the way for future work on hyperparameter fitting as well as on the promising field of goal-oriented fold design.
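
In the classical leave-one-out special case that the paper generalizes, the residuals and their variances can be read off a single inverse of the covariance matrix: for zero-mean Simple Kriging, the i-th leave-one-out residual is [K^{-1} y]_i / [K^{-1}]_{ii} with variance 1 / [K^{-1}]_{ii}. A minimal Python/NumPy sketch of that special case (the kernel, synthetic data, and jitter below are illustrative assumptions, not the paper's setup or its R implementation):

```python
import numpy as np

def sq_exp_kernel(x, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance matrix for 1-D inputs x."""
    d = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
n = 50
x = np.sort(rng.uniform(0, 10, n))
K = sq_exp_kernel(x) + 1e-6 * np.eye(n)          # small jitter for numerical stability
y = rng.multivariate_normal(np.zeros(n), K)

# Fast leave-one-out: e_i = [K^{-1} y]_i / [K^{-1}]_{ii},  var_i = 1 / [K^{-1}]_{ii}
K_inv = np.linalg.inv(K)
loo_residuals = (K_inv @ y) / np.diag(K_inv)
loo_variances = 1.0 / np.diag(K_inv)

# Naive check for one point: refit without observation i and predict it
i = 7
mask = np.arange(n) != i
pred_i = K[i, mask] @ np.linalg.solve(K[np.ix_(mask, mask)], y[mask])
print(y[i] - pred_i, loo_residuals[i])           # the two residuals agree
```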


Cross-Validation and Uncertainty Determination for Randomized Neural Networks with Applications to Mobile Sensors

arXiv.org Machine Learning

Randomized artificial neural networks such as extreme learning machines provide an attractive and efficient method for supervised learning under limited computing resources and for green machine learning. This especially applies when equipping mobile devices (sensors) with weak artificial intelligence. Results on supervised learning with such networks and regression methods are discussed in terms of consistency and bounds for the generalization and prediction error. In particular, some recent results are reviewed that address learning with data sampled by moving sensors, which leads to non-stationary and dependent samples. As randomized networks lead to random out-of-sample performance measures, we study a cross-validation approach to handle the randomness and make use of it to improve out-of-sample performance. Additionally, a computationally efficient approach based on two-stage estimation is discussed for determining the resulting uncertainty in terms of a confidence interval for the mean out-of-sample prediction error. The approach is applied to a prediction problem arising in vehicle-integrated photovoltaics.
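
As a rough illustration of handling the randomness of such networks with cross-validation, the sketch below scores several random hidden-layer draws of an extreme-learning-machine-style regressor by 5-fold CV and keeps the best draw. All names, the toy data, and the selection rule are illustrative assumptions and not the authors' procedure (which additionally covers uncertainty quantification via two-stage estimation):

```python
import numpy as np
from sklearn.model_selection import KFold

def elm_fit(X, y, W, b, ridge=1e-2):
    """Fit output weights of a random-feature (ELM-style) regressor by ridge regression."""
    H = np.tanh(X @ W + b)                      # random hidden features
    return np.linalg.solve(H.T @ H + ridge * np.eye(H.shape[1]), H.T @ y)

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(300)

n_hidden, cv = 50, KFold(n_splits=5, shuffle=True, random_state=0)
best = None
for seed in range(10):                          # candidate random hidden layers
    r = np.random.default_rng(seed)
    W, b = r.standard_normal((1, n_hidden)), r.standard_normal(n_hidden)
    mse = []
    for tr, te in cv.split(X):
        beta = elm_fit(X[tr], y[tr], W, b)
        mse.append(np.mean((y[te] - elm_predict(X[te], W, b, beta)) ** 2))
    if best is None or np.mean(mse) < best[0]:
        best = (np.mean(mse), seed)
print("best random draw:", best)
```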


Large Non-Stationary Noisy Covariance Matrices: A Cross-Validation Approach

arXiv.org Machine Learning

We introduce a novel covariance estimator that exploits the heteroscedastic nature of financial time series by employing exponentially weighted moving averages and shrinking the in-sample eigenvalues through cross-validation. Our estimator is model-agnostic in that we make no assumptions on the distribution of the random entries of the matrix or on the structure of the covariance matrix. Additionally, we show how Random Matrix Theory can provide guidance for automatic tuning of the hyperparameter that characterizes the time scale of the dynamics of the estimator. By attenuating the noise from both the cross-sectional and time-series dimensions, we empirically demonstrate the superiority of our estimator over competing estimators based on exponentially-weighted and uniformly-weighted covariance matrices.
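
A minimal Python/NumPy sketch of the exponentially weighted ingredient (the half-life, synthetic returns, and the crude eigenvalue shrinkage at the end are placeholders; the paper tunes the time scale via Random Matrix Theory and shrinks eigenvalues through cross-validation):

```python
import numpy as np

def ewma_covariance(returns, halflife=60):
    """Exponentially weighted sample covariance of a (T, N) return matrix."""
    T, _ = returns.shape
    decay = 0.5 ** (1.0 / halflife)
    w = decay ** np.arange(T - 1, -1, -1)       # most recent observation gets the largest weight
    w = w / w.sum()
    X = returns - returns.mean(axis=0)
    return (X * w[:, None]).T @ X

rng = np.random.default_rng(1)
returns = 0.01 * rng.standard_normal((500, 20))  # placeholder for real return data
S = ewma_covariance(returns)

# Crude eigenvalue shrinkage toward the average eigenvalue (placeholder, not the paper's CV rule)
eigval, eigvec = np.linalg.eigh(S)
shrunk = 0.5 * eigval + 0.5 * eigval.mean()
S_shrunk = eigvec @ np.diag(shrunk) @ eigvec.T
```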


Proper Model Selection through Cross Validation

#artificialintelligence

So, what is cross validation? Recall my post about model selection, where we saw that it may be necessary to split the data into three different portions: one for training, one for validation (to choose among models), and a final portion to measure the true accuracy. This procedure is one viable way to choose the best among several models. Cross validation (CV) is not too different from this idea, but it handles training and validation in quite a smart way. For CV we use a larger, combined training-and-validation data set, followed by a separate testing dataset.
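
In scikit-learn terms, this amounts to holding out a test set once and letting cross-validation play the role of the validation split when choosing among models; the dataset and candidate models below are placeholders for illustration:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_diabetes(return_X_y=True)

# Hold out a test set once; it is only touched after model selection.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Cross-validation on the combined training/validation data chooses among candidate models.
candidates = {"ridge": Ridge(alpha=1.0), "lasso": Lasso(alpha=0.1)}
scores = {name: cross_val_score(m, X_train, y_train, cv=5).mean() for name, m in candidates.items()}
best_name = max(scores, key=scores.get)

# Only the selected model is refit on all training data and evaluated on the untouched test set.
best = candidates[best_name].fit(X_train, y_train)
print(best_name, scores[best_name], best.score(X_test, y_test))
```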


Optimizing Approximate Leave-one-out Cross-validation to Tune Hyperparameters

arXiv.org Machine Learning

For a large class of regularized models, leave-one-out cross-validation can be efficiently estimated with an approximate leave-one-out formula (ALO). We consider the problem of adjusting hyperparameters so as to optimize ALO. We derive efficient formulas to compute the gradient and Hessian of ALO and show how to apply a second-order optimizer to find hyperparameters. We demonstrate the usefulness of the proposed approach by finding hyperparameters for regularized logistic regression and ridge regression on various real-world data sets.
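
For ridge regression in particular, exact leave-one-out residuals are available from a single fit via the leverages h_ii of the hat matrix, e_i = (y_i - yhat_i) / (1 - h_ii); the ALO formulas of the paper extend this kind of shortcut to a broader class of regularized models and make it differentiable in the hyperparameters. A minimal NumPy sketch of the ridge special case, with a simple grid scan standing in for the paper's second-order optimizer (names and data are illustrative):

```python
import numpy as np

def ridge_loo_mse(X, y, lam):
    """Exact leave-one-out MSE for ridge regression from a single fit."""
    n, p = X.shape
    G = X.T @ X + lam * np.eye(p)
    beta = np.linalg.solve(G, X.T @ y)
    h = np.einsum("ij,jk,ik->i", X, np.linalg.inv(G), X)   # leverages h_ii
    loo_residuals = (y - X @ beta) / (1.0 - h)
    return np.mean(loo_residuals ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = X @ rng.standard_normal(10) + 0.5 * rng.standard_normal(200)

# Scan the regularization strength and keep the LOO-optimal value
lams = np.logspace(-3, 2, 30)
best_lam = min(lams, key=lambda lam: ridge_loo_mse(X, y, lam))
print(best_lam)
```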


Cross-validation Confidence Intervals for Test Error

arXiv.org Machine Learning

This work develops central limit theorems for cross-validation and consistent estimators of its asymptotic variance under weak stability conditions on the learning algorithm. Together, these results provide practical, asymptotically-exact confidence intervals for $k$-fold test error and valid, powerful hypothesis tests of whether one learning algorithm has smaller $k$-fold test error than another. These results are also the first of their kind for the popular choice of leave-one-out cross-validation. In our real-data experiments with diverse learning algorithms, the resulting intervals and tests outperform the most popular alternative methods from the literature.
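
For contrast, the most common baseline simply treats the k per-fold error estimates as independent and applies a t-interval, ignoring the dependence between folds that the paper's theory accounts for. A sketch of that naive baseline (dataset and model are placeholders; this is not the estimator proposed in the paper):

```python
import numpy as np
from scipy import stats
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

X, y = load_diabetes(return_X_y=True)
k, fold_mse = 10, []
for tr, te in KFold(n_splits=k, shuffle=True, random_state=0).split(X):
    model = Ridge(alpha=1.0).fit(X[tr], y[tr])
    fold_mse.append(np.mean((y[te] - model.predict(X[te])) ** 2))

fold_mse = np.array(fold_mse)
mean, se = fold_mse.mean(), fold_mse.std(ddof=1) / np.sqrt(k)
half_width = stats.t.ppf(0.975, df=k - 1) * se     # naive t-interval half-width
print(f"k-fold test error: {mean:.1f} +/- {half_width:.1f}")
```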


Cross Validation Machine Learning: K-Fold

#artificialintelligence

Cross-validation is used to evaluate machine learning models on a limited data sample. It estimates the skill of a machine learning model on unseen data. The technique creates and validates a given model multiple times. There are several common types of cross-validation, such as stratified, leave-one-out (LOOCV), and K-fold. Here, we will study the K-fold technique. Let's split the data 70:30, train the model, and test it on the held-out portion of the dataset to get an accuracy figure.
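
A minimal scikit-learn sketch contrasting the single 70:30 split mentioned above with a 5-fold estimate (the dataset and model are placeholders):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Single 70:30 hold-out split: one accuracy number, sensitive to how the split falls
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
holdout_acc = LogisticRegression(max_iter=5000).fit(X_tr, y_tr).score(X_te, y_te)

# 5-fold cross-validation: five accuracy numbers averaged over rotating test folds
cv_acc = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=5)
print(holdout_acc, cv_acc.mean())
```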


Cross-validation and hyperparameter tuning

#artificialintelligence

Almost every machine learning algorithm comes with a large number of settings that we, the machine learning researchers and practitioners, need to specify. These tuning knobs, the so-called hyperparameters, help us control the behavior of machine learning algorithms when optimizing for performance, finding the right balance between bias and variance. Hyperparameter tuning for performance optimization is an art in itself, and there are no hard-and-fast rules that guarantee best performance on a given dataset. In Part I and Part II, we saw different holdout and bootstrap techniques for estimating the generalization performance of a model. We learned about the bias-variance trade-off, and we computed the uncertainty of our estimates.
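
As a concrete illustration of tuning hyperparameters with cross-validation, the following scikit-learn sketch scores every candidate setting by 5-fold CV; the model and grid are placeholders, not a recommendation from the article:

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

X, y = load_diabetes(return_X_y=True)

# Each candidate (C, epsilon) pair is scored by 5-fold cross-validation
grid = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10, 100], "epsilon": [0.01, 0.1, 1.0]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```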


When to Impute? Imputation before and during cross-validation

arXiv.org Machine Learning

Cross-validation (CV) is a technique used to estimate generalization error for prediction models. For pipeline modeling algorithms (i.e. modeling procedures with multiple steps), it has been recommended that the entire sequence of steps be carried out during each replicate of CV to mimic the application of the entire pipeline to an external testing set. While theoretically sound, following this recommendation can lead to high computational costs when a pipeline modeling algorithm includes computationally expensive operations, e.g. imputation of missing values. There is a general belief that unsupervised variable selection (i.e. ignoring the outcome) can be applied before conducting CV without incurring bias, but there is less consensus for unsupervised imputation of missing values. We empirically assessed whether conducting unsupervised imputation prior to CV would result in biased estimates of generalization error or in poorly selected tuning parameters, and thus degrade the external performance of downstream models. Results show that, despite an optimistic bias, the reduced variance of imputation before CV (compared to imputation during each replicate of CV) leads to a lower overall root mean squared error for estimating the true external R-squared; moreover, the performance of models tuned using CV with imputation before versus during each replication differs only minimally. In conclusion, unsupervised imputation before CV appears valid in certain settings and may be a helpful strategy that enables analysts to use more flexible imputation techniques without incurring high computational costs.
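
In scikit-learn terms, the two strategies compared in the study look roughly like the sketch below, with mean imputation and a placeholder dataset standing in for the imputation methods and data actually used:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = load_diabetes(return_X_y=True)
rng = np.random.default_rng(0)
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.1] = np.nan     # inject 10% missing values

# Imputation *during* CV: the imputer is refit on each training fold inside the pipeline
pipeline = make_pipeline(SimpleImputer(strategy="mean"), Ridge(alpha=1.0))
during = cross_val_score(pipeline, X_missing, y, cv=5).mean()

# Imputation *before* CV: impute once on all data, then cross-validate the model alone
X_imputed = SimpleImputer(strategy="mean").fit_transform(X_missing)
before = cross_val_score(Ridge(alpha=1.0), X_imputed, y, cv=5).mean()
print(during, before)
```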


Machine Learning: Some notes about Cross-Validation

#artificialintelligence

K-fold cross-validation is one of the most used cross-validation methods. In this method, k represents the number of experiments (or folds) that I want to run in order to train and test my data. For example, suppose that we want to run 5 experiments with a dataset of 1000 records. During the first experiment, we test (validate) on the first 200 records and then train on the remaining 800 records. When the first experiment is finished, I obtain a certain accuracy.
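
The 1000-record, 5-fold example above corresponds directly to the following sketch (synthetic placeholder data and a simple linear model):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.standard_normal(1000)

scores = []
for train_idx, test_idx in KFold(n_splits=5).split(X):
    # Each experiment: 200 records held out for testing, 800 used for training
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))   # R^2 per fold
print(scores, np.mean(scores))
```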