AITopics | cross-validation technique


cross-validation technique


Comparing Cluster-Based Cross-Validation Strategies for Machine Learning Model Evaluation

Spezia, Afonso Martini, Fontanari, Thomas, Recamonde-Mendoza, Mariana

arXiv.org Artificial Intelligence, Aug-28-2025

Cross-validation plays a fundamental role in Machine Learning, enabling robust evaluation of model performance and preventing overestimation on training and validation data. However, one of its drawbacks is the potential to create data subsets (folds) that do not adequately represent the diversity of the original dataset, which can lead to biased performance estimates. The objective of this work is to deepen the investigation of cluster-based cross-validation strategies by analyzing the performance of different clustering algorithms through experimental comparison. Additionally, a new cross-validation technique that combines Mini Batch K-Means with class stratification is proposed. Experiments were conducted on 20 datasets (both balanced and imbalanced) using four supervised learning algorithms, comparing cross-validation strategies in terms of bias, variance, and computational cost. The technique that uses Mini Batch K-Means with class stratification outperformed others in terms of bias and variance on balanced datasets, though it did not significantly reduce computational cost. On imbalanced datasets, traditional stratified cross-validation consistently performed better, showing lower bias, variance, and computational cost, making it a safe choice for performance evaluation in scenarios with class imbalance. In the comparison of different clustering algorithms, no single algorithm consistently stood out as superior. Overall, this work contributes to improving predictive model evaluation strategies by providing a deeper understanding of the potential of cluster-based data splitting techniques and reaffirming the effectiveness of well-established strategies like stratified cross-validation. Moreover, it highlights perspectives for increasing the robustness and reliability of model evaluations, especially in datasets with clustering characteristics.
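To make the idea of combining clustering with class stratification concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes cluster assignments are already available (for example from a Mini Batch K-Means step) and spreads the members of each (cluster, class) group across folds in round-robin order. The function name `cluster_stratified_folds` and the toy data are hypothetical.

```python
from collections import defaultdict


def cluster_stratified_folds(cluster_ids, class_labels, n_folds):
    """Assign a fold index to each sample so that every
    (cluster, class) group is spread across the folds."""
    # Group sample indices by their (cluster, class) pair.
    groups = defaultdict(list)
    for idx, key in enumerate(zip(cluster_ids, class_labels)):
        groups[key].append(idx)

    fold_of = [None] * len(cluster_ids)
    next_fold = 0  # carried across groups so small groups do not all land in fold 0
    for key in sorted(groups):
        for idx in groups[key]:
            fold_of[idx] = next_fold
            next_fold = (next_fold + 1) % n_folds
    return fold_of


# Hypothetical toy data: in practice the cluster ids would come from a clustering step.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
class_labels = ["a", "a", "b", "b", "a", "a", "b", "b"]
folds = cluster_stratified_folds(cluster_ids, class_labels, n_folds=2)
```

With this toy input, each of the two folds receives one sample from every (cluster, class) group, which is the kind of representativeness the abstract is concerned with.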

artificial intelligence, dataset, machine learning, (14 more...)


2507.22299

Country: South America > Brazil (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.94)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (1.00)


Top 7 cross validation techniques with Python Code - Analytics Vidhya

#artificialintelligence, Nov-23-2021, 08:36:50 GMT

Not suitable for time series data: for time series data the order of the samples matters, but in stratified cross-validation samples are selected in random order. LeavePOut cross-validation is an exhaustive cross-validation technique in which p samples are used as the validation set and the remaining n-p samples are used as the training set. Suppose we have 100 samples in the dataset. If we use p = 10, then in each iteration 10 samples will be used as the validation set and the remaining 90 samples as the training set. This process is repeated until every possible combination of p validation samples and n-p training samples has been used. All the data samples get used as both training and validation samples.
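Because leave-p-out is exhaustive, it evaluates every possible choice of p validation samples, so the number of iterations is the binomial coefficient C(n, p). The sketch below illustrates this with the standard library only; the helper name `leave_p_out_splits` is hypothetical, and the sample count is kept tiny because C(100, 10) is far too large to enumerate in practice. scikit-learn offers the same behaviour through `sklearn.model_selection.LeavePOut`.

```python
from itertools import combinations
from math import comb


def leave_p_out_splits(n_samples, p):
    """Yield (train_indices, validation_indices) for every
    possible choice of p validation samples."""
    all_indices = range(n_samples)
    for val in combinations(all_indices, p):
        val_set = set(val)
        train = [i for i in all_indices if i not in val_set]
        yield train, list(val)


splits = list(leave_p_out_splits(n_samples=5, p=2))
n_splits = len(splits)  # one split per combination of 2 out of 5 samples
first_train, first_val = splits[0]
```

Each split keeps p samples for validation and the remaining n-p for training, and every sample appears in the validation set of several splits.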

dataset, training and validation, validation, (14 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (1.00)


Cross-Validation Techniques

#artificialintelligence, Aug-30-2021, 19:02:17 GMT

Time Series Cross-Validation Method 14. Blocked Cross-Validation Method 15.

artificial intelligence, cross-validation technique, machine learning, (3 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.98)


'Meta' machine learning packages in R – Towards Data Science

#artificialintelligence, Jul-3-2018, 17:46:29 GMT

Scalability may also pose a critical bottleneck that one should care about. Each of these meta packages deals with it in a different way.

artificial intelligence, machine learning, regression, (18 more...)


Genre: Instructional Material (0.47)

Industry: Education (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)


How do you know if your model is going to work? Part 4: Cross-validation techniques

#artificialintelligence, Apr-29-2016, 20:30:54 GMT

In this article we conclude our four-part series on basic model testing. When fitting and selecting models in a data science project, how do you know that your final model is good? And how sure are you that it's better than the models that you rejected? In this concluding Part 4 of our four-part mini-series "How do you know if your model is going to work?" we demonstrate cross-validation techniques. Cross-validation techniques attempt to improve statistical efficiency by repeatedly splitting the data into train and test sets and repeating the model fit and model evaluation on each split.
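The repeated train/test splitting described here can be sketched as a plain k-fold index generator. This is a minimal illustration, not the article's code: the function name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled, and a real workflow would fit and score a model inside the loop.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds
    whose sizes differ by at most one."""
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=7, k=3))
```

Every sample lands in exactly one test fold, so each observation is used for evaluation once and for training k-1 times.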

artificial intelligence, cross-validation technique, machine learning, (13 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.95)
