AITopics | kfcv

Collaborating Authors

kfcv

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Fusion Sampling Validation in Data Partitioning for Machine Learning

Udomboso, Christopher Godwin, Sigauke, Caston, Adinya, Ini

arXiv.org Artificial IntelligenceAug-5-2025

Effective data partitioning is known to be crucial in machine learning. Traditional cross-validation methods like K-Fold Cross-Validation (KFCV) enhance model robustness but often compromise generalisation assessment due to high computational demands and extensive data shuffling. To address these issues, the integration of the Simple Random Sampling (SRS), which, despite providing representative samples, can result in non-representative sets with imbalanced data. The study introduces a hybrid model, Fusion Sampling Validation (FSV), combining SRS and KFCV to optimise data partitioning. FSV aims to minimise biases and merge the simplicity of SRS with the accuracy of KFCV. The study used three datasets of 10,000, 50,000, and 100,000 samples, generated with a normal distribution (mean 0, variance 1) and initialised with seed 42. KFCV was performed with five folds and ten repetitions, incorporating a scaling factor to ensure robust performance estimation and generalisation capability. FSV integrated a weighted factor to enhance performance and generalisation further. Evaluations focused on mean estimates (ME), variance estimates (VE), mean squared error (MSE), bias, the rate of convergence for mean estimates (ROC\_ME), and the rate of convergence for variance estimates (ROC\_VE). Results indicated that FSV consistently outperformed SRS and KFCV, with ME values of 0.000863, VE of 0.949644, MSE of 0.952127, bias of 0.016288, ROC\_ME of 0.005199, and ROC\_VE of 0.007137. FSV demonstrated superior accuracy and reliability in data partitioning, particularly in resource-constrained environments and extensive datasets, providing practical solutions for effective machine learning implementations.

artificial intelligence, machine learning, variance, (15 more...)

arXiv.org Artificial Intelligence

2508.01325

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Evaluating K-Fold Cross Validation for Transformer Based Symbolic Regression Models

Kislay, Kaustubh, Singh, Shlok, Joshi, Soham, Dutta, Rohan, Flint, Jay Shim George, Zhu, Kevin

arXiv.org Artificial IntelligenceOct-29-2024

Symbolic Regression remains an NP-Hard problem, with extensive research focusing on AI models for this task. Transformer models have shown promise in Symbolic Regression, but performance suffers with smaller datasets. We propose applying k-fold cross-validation to a transformer-based symbolic regression model trained on a significantly reduced dataset (15,000 data points, down from 500,000). This technique partitions the training data into multiple subsets (folds), iteratively training on some while validating on others. Our aim is to provide an estimate of model generalization and mitigate overfitting issues associated with smaller datasets. Results show that this process improves the model's output consistency and generalization by a relative improvement in validation loss of 53.31%. Potentially enabling more efficient and accessible symbolic regression in resource-constrained environments.

dataset, kfcv, smaller dataset, (13 more...)

arXiv.org Artificial Intelligence

2410.21896

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.64)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.62)

Add feedback