
Misconduct in Post-Selections and Deep Learning

Weng, Juyang

arXiv.org Artificial Intelligence

This is a theoretical paper on "Deep Learning" misconduct in particular and Post-Selection in general. As far as the author knows, the first peer-reviewed papers on Deep Learning misconduct are [32], [37], [36]. Regardless of learning modes, e.g., supervised, reinforcement, adversarial, and evolutionary, almost all machine learning methods (except for a few methods that train a sole system) are rooted in the same misconduct of cheating and hiding: (1) cheating in the absence of a test and (2) hiding bad-looking data. It was reasoned in [32], [37], [36] that authors must report at least the average error of all trained networks, good and bad, on the validation set (called general cross-validation in this paper); better still, they should also report the errors at five percentage positions of the ranked errors. From the new analysis here, we can see that the hidden culprit is Post-Selection. This is also true for Post-Selection on hand-tuned or searched hyperparameters, because such hyperparameters are themselves random, depending on random observation data. Does cross-validation on data splits rescue Post-Selection from Misconducts (1) and (2)? The new result here says no. Specifically, this paper reveals that cross-validation on data splits is insufficient to exonerate Post-Selection in machine learning. In general, Post-Selection of statistical learners based on their errors on the validation set is statistically invalid.
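As a concrete illustration of the reporting protocol argued for above, the following is a minimal sketch in Python; the function name and the particular five percentage positions chosen are this sketch's own assumptions, not the paper's, and the upstream code that trains the networks and produces their validation errors is assumed to exist.

    import numpy as np

    def report_general_cross_validation(val_errors):
        """Report ALL trained networks' validation errors, not only the luckiest one.

        `val_errors` is assumed to hold one error per independently trained network
        (supplied by hypothetical upstream training code).
        """
        errors = np.sort(np.asarray(val_errors, dtype=float))
        positions = [0, 25, 50, 75, 100]  # five ranked percentage positions (assumed choice)
        ranked = {p: float(np.percentile(errors, p)) for p in positions}
        return {
            "mean_error": float(errors.mean()),        # average over all networks, good and bad
            "ranked_errors": ranked,                   # min, quartiles, max of the ranked errors
            "best_only_misleading": float(errors[0]),  # what Post-Selection alone would report
        }

    if __name__ == "__main__":
        # e.g., 20 networks trained from different random initializations
        rng = np.random.default_rng(0)
        print(report_general_cross_validation(rng.uniform(0.05, 0.30, size=20)))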


Why Deep Learning's Performance Data Are Misleading

Weng, Juyang

arXiv.org Artificial Intelligence

This is a theoretical paper, written as a companion paper to the keynote talk at the same conference, AIEE 2023. In contrast to conscious learning, many projects in AI have employed so-called "deep learning", many of which seemed to give impressive performance. This paper explains that such performance data are deceptively inflated due to two misconducts: "data deletion" and "test on training set". It clarifies what "data deletion" and "test on training set" mean in deep learning and why they are misconducts. A simple classification method is defined, called Nearest Neighbor With Threshold (NNWT). A theorem is established that the NNWT method reaches a zero error on any validation set and any test set using the two misconducts, as long as the test set is in the possession of the author and both the amount of storage space and the time of training are finite but unbounded, as with many deep learning methods. However, like the NNWT method, many deep learning methods are not generalizable, since they have never been tested on a true test set. Why? The so-called "test set" was used in the Post-Selection step of the training stage. The evidence that such misconducts actually took place in many deep learning projects is beyond the scope of this paper.
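The theorem's construction can be caricatured in code. Below is a minimal sketch, assuming Euclidean distance and a pure guess (here, the first stored label) for out-of-threshold queries; the class name matches the abstract's acronym, but the details are illustrative rather than the paper's exact definition.

    import numpy as np

    class NNWT:
        """Sketch of a Nearest-Neighbor-With-Threshold classifier (illustrative only)."""

        def __init__(self, threshold=1e-6):
            self.threshold = threshold
            self.X = None
            self.y = None

        def fit(self, X, y):
            # Finite but unbounded storage: memorize every example verbatim.
            self.X = np.asarray(X, dtype=float)
            self.y = np.asarray(y)
            return self

        def predict(self, X):
            preds = []
            for x in np.asarray(X, dtype=float):
                d = np.linalg.norm(self.X - x, axis=1)
                i = int(np.argmin(d))
                if d[i] <= self.threshold:
                    preds.append(self.y[i])   # exact recall of a memorized sample
                else:
                    preds.append(self.y[0])   # pure guess for genuinely unseen inputs
            return np.array(preds)

    # The misconduct illustrated: if the "test" set is in the author's possession,
    # memorizing it yields a perfect score on that set without any generalization.
    X_test = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y_test = np.array([0, 1, 1])
    model = NNWT().fit(X_test, y_test)
    assert (model.predict(X_test) == y_test).all()   # zero "test" error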


On "Deep Learning" Misconduct

Weng, Juyang

arXiv.org Artificial Intelligence

This is a theoretical paper, written as a companion paper to the plenary talk at the same conference, ISAIC 2022. In contrast to conscious learning (Weng, 2022b; Weng, 2022c), presented in the author's plenary talk at the same conference, which develops a single network for a life (many tasks), "Deep Learning" trains multiple networks for each task. Although "Deep Learning" may use different learning modes, including supervised, reinforcement, and adversarial modes, almost all "Deep Learning" projects apparently suffer from the same misconducts, called "data deletion" and "test on training data". This paper establishes a theorem that a simple method called Pure-Guess Nearest Neighbor (PGNN) reaches any required error on the validation set and the test set, including a zero-error requirement, through the same misconducts, as long as the test set is in the possession of the authors and both the amount of storage space and the time of training are finite but unbounded. The misconduct violates the well-known protocols of transparency and cross-validation. The nature of the misconduct is fatal because, in the absence of any disjoint test, "Deep Learning" is clearly not generalizable.
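To make the pure-guess argument concrete, here is a minimal sketch, assuming a finite label set and a brute-force guessing loop that stands in for the paper's "finite but unbounded" training time; the function name and interface are hypothetical, not the paper's construction.

    import numpy as np

    def pgnn_post_select(y_test, num_classes, target_error=0.0, seed=0):
        """Sketch of the PGNN argument: guess labels for a test set the authors possess,
        and keep (Post-Select) the guess that meets the required error."""
        rng = np.random.default_rng(seed)
        y_test = np.asarray(y_test)
        while True:                                    # unbounded "training" time
            guess = rng.integers(0, num_classes, size=y_test.size)
            error = float(np.mean(guess != y_test))
            if error <= target_error:                  # Post-Selection on the possessed test set
                return guess, error

    # Any required error, including zero, is eventually reached on a small possessed test set,
    # without the guesser having learned anything generalizable.
    guess, err = pgnn_post_select(y_test=[0, 1, 1, 2, 0], num_classes=3)
    print(err)   # 0.0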


Post Selections Using Test Sets (PSUTS) and How Developmental Networks Avoid Them

Weng, Juyang

arXiv.org Artificial Intelligence

For example, a "what" concept is "where"-invariant and a "where" concept is "what"-invariant, as explained in [55], [68]. Section IV discusses an optimal framework through which such abstractions can take place by learning simple rules during early life that enable the learning of more complex rules during later life, called scaffolding [69]. Theorem 2 leads to two observations on data fitting on a static data set. Observation 1: Any data fitting on a static data set without learning invariant concepts is nonscalable, including the n-fold cross-validation discussed below. Unfortunately, data fitting on a static data set is the norm in all ImageNet Contests [66]. Accordingly, the remaining subsections of this section analyze approaches that are nonscalable. For example, computer vision is not a "one-shot" pattern classification problem as argued by Li Fei-Fei et al. [19] (a claim questioned on PubMed without a response), but rather a spatiotemporal problem of learning various invariant concepts present in cluttered natural scenes through autonomous attention saccades, as explained further in Observation 2. Observation 2: Learning invariant concepts also seems nonscalable under data fitting on a static data set, because there are too many images to label by hand (e.g., at all pixel locations) [55], [68]. Like a human baby, any scalable machine learning method must be conscious, such that the machine learner consciously guesses concepts (e.g., an object type), beyond mere active learning [70], and verifies their invariance rules (e.g., the where-invariance of a what concept).