rayner
Quantifying the Effects of Word Length, Frequency, and Predictability on Dyslexia
Rydel-Johnston, Hugo, Kafkas, Alex
Division of Psychology, Communication & Human Neuroscience, The University of Manchester
Author Note: Hugo Rydel-Johnston https://orcid.org/0009-0006-1103-1015; Alex Kafkas https://orcid.org/0000-0001-5133-8827. We have no conflicts of interest to disclose. Correspondence concerning this article should be addressed to Hugo Rydel-Johnston, Division of Psychology, Communication & Human Neuroscience, The University of Manchester, Oxford Road, Manchester, M13 9PL, UK.
Abstract: We ask where, and under what conditions, dyslexic reading costs arise in a large-scale naturalistic reading dataset. Using eye-tracking aligned to word-level properties (word length, frequency, and predictability), we model the influence of each of these features on dyslexic time costs. We find that all three properties robustly change reading times in both typical and dyslexic readers, but dyslexic readers show stronger sensitivities to each of the three features, especially predictability. Counterfactual manipulations of these features substantially narrow the dyslexic-control gap, by about one-third, with predictability showing the strongest effect, followed by length and frequency. These patterns align with existing dyslexia theories suggesting heightened demands on linguistic working memory and phonological encoding in dyslexic reading, and they directly motivate further research into lexical complexity and preview benefits to further explain the quantified gap. In effect, these findings break down when extra dyslexic costs arise and how large they are, and provide actionable guidance for the development of interventions and computational models for dyslexic readers.
Keywords: eye movements, reading time, word length, lexical frequency, predictability, skipping, total reading time
Why Dyslexic Reading Takes Longer - And When
Dyslexia is characterized by persistent difficulty in accurate and/or fluent word recognition and decoding (Lyon et al., 2003) and affects 4-8% of individuals (Yang et al., 2022; Doust et al., 2022).
- Europe > United Kingdom (0.24)
- Europe > Germany > Saxony > Leipzig (0.04)
- Europe > Germany > Brandenburg > Potsdam (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.30)
A Spatio-Temporal Point Process for Fine-Grained Modeling of Reading Behavior
Re, Francesco Ignazio, Opedal, Andreas, Manaiev, Glib, Giulianelli, Mario, Cotterell, Ryan
Reading is a process that unfolds across space and time, alternating between fixations where a reader focuses on a specific point in space, and saccades where a reader rapidly shifts their focus to a new point. An ansatz of psycholinguistics is that modeling a reader's fixations and saccades yields insight into their online sentence processing. However, standard approaches to such modeling rely on aggregated eye-tracking measurements and models that impose strong assumptions, ignoring much of the spatio-temporal dynamics that occur during reading. In this paper, we propose a more general probabilistic model of reading behavior, based on a marked spatio-temporal point process, that captures not only how long fixations last, but also where they land in space and when they take place in time. The saccades are modeled using a Hawkes process, which captures how each fixation excites the probability of a new fixation occurring near it in time and space. The duration time of fixation events is modeled as a function of fixation-specific predictors convolved across time, thus capturing spillover effects. Empirically, our Hawkes process model exhibits a better fit to human saccades than baselines. With respect to fixation durations, we observe that incorporating contextual surprisal as a predictor results in only a marginal improvement in the model's predictive accuracy. This finding suggests that surprisal theory struggles to explain fine-grained eye movements.
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.04)
- North America > Costa Rica > Heredia Province > Heredia (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
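The self-exciting idea in the abstract above, where each fixation raises the chance of another fixation soon after, can be illustrated with a minimal sketch. This is not the authors' model: it assumes a univariate, purely temporal Hawkes process with an exponential kernel, and the function name `hawkes_intensity` and the parameter values `mu`, `alpha`, and `beta` are hypothetical choices made only for illustration. The paper's model is additionally spatial and marked.

```python
import math

def hawkes_intensity(t, event_times, mu=0.5, alpha=0.8, beta=1.2):
    """Conditional intensity of a simplified temporal Hawkes process.

    lambda(t) = mu + sum over past events t_i < t of alpha * exp(-beta * (t - t_i))

    mu    : baseline rate of events (assumed value)
    alpha : jump in intensity caused by each past event (assumed value)
    beta  : exponential decay rate of that excitation (assumed value)
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t  # only strictly earlier events contribute
    )
    return mu + excitation

# Hypothetical fixation onset times in seconds (not data from the paper).
fixation_onsets = [0.00, 0.25, 0.48]

for t in (0.5, 1.0, 3.0):
    print(f"t={t:.2f}s  intensity={hawkes_intensity(t, fixation_onsets):.4f}")
```

The intensity is highest right after a cluster of events and decays back toward the baseline `mu`, which is the qualitative behavior the abstract attributes to the Hawkes component.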
On the Proper Treatment of Tokenization in Psycholinguistics
Giulianelli, Mario, Malagutti, Luca, Gastaldi, Juan Luis, DuSell, Brian, Vieira, Tim, Cotterell, Ryan
Language models are widely used in computational psycholinguistics to test theories that relate the negative log probability (the surprisal) of a region of interest (a substring of characters) under a language model to its cognitive cost experienced by readers, as operationalized, for example, by gaze duration on the region. However, the application of modern language models to psycholinguistic studies is complicated by the practice of using tokenization as an intermediate step in training a model. Doing so results in a language model over token strings rather than one over character strings. Vexingly, regions of interest are generally misaligned with these token strings. The paper argues that token-level language models should be (approximately) marginalized into character-level language models before they are used in psycholinguistic studies to compute the surprisal of a region of interest; then, the marginalized character-level language model can be used to compute the surprisal of an arbitrary character substring, which we term a focal area, that the experimenter may wish to use as a predictor. Our proposal of marginalizing a token-level model into a character-level one solves this misalignment issue independently of the tokenization scheme. Empirically, we discover various focal areas whose surprisal is a better psychometric predictor than the surprisal of the region of interest itself.
- Europe > Switzerland > Zürich > Zürich (0.04)
- Asia (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)
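The marginalization step described in the abstract above can be illustrated on a toy case. The sketch below is not the paper's algorithm: it assumes a tiny, hypothetical distribution over complete token sequences (`token_model`), sums the probability of every token sequence that spells the same character string, and then computes surprisal as the negative log probability. The helper names are illustrative only, and the paper's approximate marginalization for real language models is more involved.

```python
import math
from collections import defaultdict

# Hypothetical toy distribution over complete token sequences (not from the paper).
# Two different tokenizations spell the same character string "reading".
token_model = {
    ("read", "ing"): 0.30,
    ("reading",): 0.20,
    ("read", "er"): 0.50,
}

def marginalize_to_strings(token_model):
    """Sum the probability of all token sequences that concatenate to the same string."""
    char_model = defaultdict(float)
    for tokens, prob in token_model.items():
        char_model["".join(tokens)] += prob
    return dict(char_model)

def surprisal_bits(prob):
    """Surprisal (negative log probability) in bits."""
    return -math.log2(prob)

char_model = marginalize_to_strings(token_model)

for string, prob in char_model.items():
    print(f"{string!r}: p={prob:.2f}, surprisal={surprisal_bits(prob):.3f} bits")
```

Scoring only the single tokenization `("reading",)` would give a different surprisal for the string "reading" than the marginalized value, which is the kind of mismatch between token strings and character strings that the abstract describes.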
A Discriminative Model for Identifying Readers and Assessing Text Comprehension from Eye Movements
Makowski, Silvia, Jäger, Lena, Abdelwahab, Ahmed, Landwehr, Niels, Scheffer, Tobias
We study the problem of inferring readers' identities and estimating their level of text comprehension from observations of their eye movements during reading. We develop a generative model of individual gaze patterns (scanpaths) that makes use of lexical features of the fixated words. Using this generative model, we derive a Fisher-score representation of eye-movement sequences. We study whether a Fisher-SVM with this Fisher kernel and several reference methods are able to identify readers and estimate their level of text comprehension based on eye-tracking data. While none of the methods are able to estimate text comprehension accurately, we find that the SVM with Fisher kernel excels at identifying readers.
- Europe > Germany > Brandenburg > Potsdam (0.05)
- Europe > Latvia > Riga Municipality > Riga (0.05)
- North America > United States > Michigan (0.04)
Flipboard on Flipboard
Want to know if you'll be dead in five years? Just let a computer look at your organs. New research has indicated that "future"-predicting computers could be coming to hospitals in the near future. Researchers are hoping that the technology could be used to predict serious illnesses and medical conditions such as heart attacks. For the study, five-year-old medical images of 48 patients' chests were analyzed by artificial intelligence.
Human Reading and the Curse of Dimensionality
Whereas optical character recognition (OCR) systems learn to classify single characters, people learn to classify long character strings in parallel, within a single fixation. This difference is surprising because high dimensionality is associated with poor classification learning. This paper suggests that the human reading system avoids these problems because the number of to-be-classified images is reduced by consistent and optimal eye fixation positions, and by character sequence regularities. An interesting difference exists between human reading and optical character recognition (OCR) systems. The input/output dimensionality of character classification in human reading is much greater than that for OCR systems (see Figure 1). OCR systems classify one character at a time, while the human reading system classifies as many as 8-13 characters per eye fixation (Rayner, 1979), and within a fixation, character category and sequence information is extracted in parallel (Blanchard, McConkie, Zola, and Wolverton, 1984; Reicher, 1969).
- North America > United States > Kansas (0.06)
- North America > United States > Texas > Travis County > Austin (0.04)