Human Reading and the Curse of Dimensionality

Neural Information Processing Systems 

Whereas optical character recognition (OCR) systems learn to classify single characters, people learn to classify long character strings in parallel, within a single fixation. This difference is surprising because high dimensionality is associated with poor classification learning. This paper suggests that the human reading system avoids these problems because the number of to-be-classified images is reduced by consistent and optimal eye fixation positions, and by character sequence regularities.

An interesting difference exists between human reading and optical character recognition (OCR) systems. The input/output dimensionality of character classification in human reading is much greater than that for OCR systems (see Figure 1).
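The scale of this dimensionality gap can be illustrated with a rough calculation. The sketch below is not taken from the paper: the alphabet size, the pixels per character, and the string lengths are illustrative assumptions, and the function names are hypothetical. It counts the unconstrained upper bound on output classes, that is, before the fixation-position and sequence-regularity reductions that the paper argues make the problem tractable.

```python
# Illustrative sketch only: all constants below are assumptions, not values from the paper.

ALPHABET_SIZE = 26         # assumed: lowercase English letters only
PIXELS_PER_CHAR = 16 * 16  # assumed: each character occupies a 16x16 pixel patch


def input_dimension(pixels_per_char: int, string_length: int) -> int:
    """Input dimensionality grows linearly with the number of characters viewed at once."""
    return pixels_per_char * string_length


def output_classes(alphabet_size: int, string_length: int) -> int:
    """Upper bound on distinct strings: grows exponentially with string length."""
    return alphabet_size ** string_length


if __name__ == "__main__":
    # string_length = 1 corresponds to single-character classification (the OCR case);
    # longer lengths correspond to classifying a whole string within one fixation.
    for n in (1, 5, 10):
        print(
            f"length={n:2d}  "
            f"input_dim={input_dimension(PIXELS_PER_CHAR, n):5d}  "
            f"max_classes={output_classes(ALPHABET_SIZE, n)}"
        )
```

Under these assumptions the input dimension grows linearly with string length, while the number of possible output classes grows exponentially, which is why classifying whole strings in parallel would be expected to be a much harder learning problem than classifying single characters.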