Learning to Read through Machine Teaching
Sen, Ayon, Cox, Christopher R., Borkenhagen, Matthew Cooper, Seidenberg, Mark S., Zhu, Xiaojin
Learning to read words aloud is a major step towards becoming a reader. Many children struggle with the task because of the inconsistencies of English spelling-sound correspondences. Curricula vary enormously in how these patterns are taught. Children are nonetheless expected to master the system in limited time (by grade 4). We used a cognitively interesting neural network architecture to examine whether the sequence of learning trials could be structured to facilitate learning. This is a hard combinatorial optimization problem even for a modest number of learning trials (e.g., 10K). We show how this sequence optimization problem can be posed as optimizing over a time-varying distribution, i.e., defining probability distributions over words at different steps in training. We then use stochastic gradient descent to find an optimal time-varying distribution and a corresponding optimal training sequence. We observed significant improvement in generalization accuracy compared to baseline conditions (random sequences; sequences biased by word frequency). These findings suggest an approach to improving learning outcomes in domains where performance depends on the ability to generalize beyond limited training experience.
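The idea of a time-varying distribution over words can be illustrated with a minimal sketch. It is not the authors' implementation: the word list, the number of steps, the softmax parameterization, and the function names are all assumptions made for illustration. It shows only how a training sequence could be drawn once such a distribution is given; the stochastic gradient descent step that would tune the distribution is omitted.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: turns each row of logits into a probability distribution."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sample_sequence(logits, trials_per_step, rng):
    """Draw a training sequence from a time-varying distribution.

    logits has shape (n_steps, n_words); row t defines the distribution
    over words used during training step t (an assumed parameterization).
    Returns an array of word indices of length n_steps * trials_per_step.
    """
    probs = softmax(logits)
    sequence = []
    for step_probs in probs:
        draws = rng.choice(len(step_probs), size=trials_per_step, p=step_probs)
        sequence.extend(draws)
    return np.array(sequence)

# Hypothetical toy vocabulary; the real training corpus is not given here.
words = ["cat", "hat", "have", "gave"]
rng = np.random.default_rng(0)

# Three training steps. Zero logits give a uniform distribution;
# raising one entry makes that word more likely during that step.
logits = np.zeros((3, len(words)))
logits[0, 0] = 5.0  # favor "cat" early in training

sequence = sample_sequence(logits, trials_per_step=4, rng=rng)
print([words[i] for i in sequence])
```

In this framing the logits are the quantities an optimizer would adjust, which replaces a search over discrete orderings of trials with a search over continuous parameters.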
Jul-2-2020