UD-English-CHILDES: A Collected Resource of Gold and Silver Universal Dependencies Trees for Child Language Interactions
Yang, Xiulin, Ju, Zhuoxuan, Bu, Lanni, Liu, Zoey, Schneider, Nathan
arXiv.org Artificial Intelligence
CHILDES is a widely used resource of transcribed child and child-directed speech. This paper introduces UD-English-CHILDES, the first officially released Universal Dependencies (UD) treebank derived from previously dependency-annotated CHILDES data, which we harmonize to follow unified annotation principles. The gold-standard trees cover utterances sampled from 11 children and their caregivers, totaling over 48K sentences (236K tokens). We validate these gold-standard annotations under the UD v2 framework and provide an additional 1M silver-standard sentences, offering a consistent resource for computational and linguistic research.
Jun-19-2025