Variation in prediction accuracy due to randomness in data division and fair evaluation using interval estimation

Sep-2-2024–arXiv.org Artificial Intelligence

These studies have been accelerated by 1) the increasing sophistication of information and communication technology, 2) large-scale data obtained through longitudinal studies and similar sources, and 3) the public release of program code for building predictive models using machine learning. They have become even more active in recent years with the advent of automated machine learning frameworks [4-6]. For example, published studies have applied machine learning algorithms (MLA) to data from the UK Biobank, a large longitudinal cohort study, to develop models that diagnose disease and predict its onset in advance [4, 7]. Such work is not new: in 1988, J. W. Smith et al. applied neural networks to data collected by the National Institute of Diabetes and Digestive and Kidney Diseases from a population of Pima Indians near Phoenix, Arizona, to predict the onset of diabetes [8-11]. This dataset, called the PID dataset, is still the primary dataset used to evaluate MLA, and in 2014 a method that combines multiple prediction models to predict the onset of the disease was proposed, showing a very high prediction accuracy of 0.97 [12-17]. As described above, a great deal of research on machine learning-based predictive models of disease has been published in recent years. However, issues remain, such as inadequate reporting of prediction models and a lack of external validation [18].

prediction accuracy, prediction model, predictive model

Country:
- North America > United States
  - Minnesota > Olmsted County
    - Rochester (0.04)
  - Arizona > Maricopa County
    - Phoenix (0.24)
- Asia > Japan
  - Honshū > Tōhoku > Miyagi Prefecture > Sendai (0.04)

Genre:
- Research Report > Experimental Study (0.94)

Industry:
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:
- Information Technology
  - Modeling & Simulation (1.00)
  - Artificial Intelligence > Machine Learning
    - Statistical Learning (1.00)
    - Ensemble Learning (0.68)
