Misconduct in Post-Selections and Deep Learning
arXiv.org Artificial Intelligence
This is a theoretical paper on "Deep Learning" misconduct in particular and Post-Selection in general. As far as the author knows, the first peer-reviewed papers on Deep Learning misconduct are [32], [37], [36]. Regardless of learning mode (e.g., supervised, reinforcement, adversarial, or evolutionary), almost all machine learning methods, except for a few that train a sole system, are rooted in the same misconduct, cheating and hiding: (1) cheating in the absence of a test and (2) hiding bad-looking data. It was reasoned in [32], [37], [36] that authors must report at least the average error of all trained networks, good and bad, on the validation set (called general cross-validation in this paper). Better still, they should also report five percentage positions of the ranked errors. The new analysis here shows that the hidden culprit is Post-Selection. This also holds for Post-Selection on hand-tuned or searched hyperparameters, because the hyperparameters are themselves random: they depend on random observation data. Does cross-validation on data splits rescue Post-Selections from Misconducts (1) and (2)? The new result here says: No. Specifically, this paper reveals that using cross-validation for data splits is insufficient to exonerate Post-Selections in machine learning. In general, Post-Selections of statistical learners based on their errors on the validation set are statistically invalid.
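The reporting rule described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `simulate_validation_errors` and `report` are hypothetical, the validation errors are simulated noise rather than errors from trained networks, and the "five percentage positions" are assumed here to be the 0%, 25%, 50%, 75%, and 100% positions of the ranked errors.

```python
import random
import statistics


def simulate_validation_errors(n_networks, seed=0):
    """Hypothetical stand-in for training n networks.

    Each simulated network has no real skill: its validation error
    is Gaussian noise around 0.5, clipped to the range [0, 1].
    """
    rng = random.Random(seed)
    return [min(1.0, max(0.0, rng.gauss(0.5, 0.1))) for _ in range(n_networks)]


def report(errors):
    """Summarize ALL trained networks, not only the luckiest one."""
    ranked = sorted(errors)
    n = len(ranked)

    def at(p):
        # Error at relative position p in the ranked list (nearest rank).
        return ranked[round(p * (n - 1))]

    return {
        # What Post-Selection alone would report: the smallest error.
        "post_selected_min": ranked[0],
        # Average over every trained network, good and bad.
        "mean": statistics.fmean(ranked),
        # Assumed five positions of the ranked errors.
        "positions": {p: at(p) for p in (0.0, 0.25, 0.5, 0.75, 1.0)},
    }


errors = simulate_validation_errors(20)
summary = report(errors)
print(summary)
```

Even though every simulated network is equally unskilled, the post-selected minimum looks better than the average. The sketch is only meant to make that gap visible, and says nothing about its size for real networks.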
Feb-13-2024