EduCoder: An Open-Source Annotation System for Education Transcript Data
Pan, Guanzhong, Tan, Mei, Nam, Hyunji, Langlois, Lucía, Malamut, James, Deonizio, Liliana, Demszky, Dorottya
arXiv.org Artificial Intelligence
We introduce EduCoder, a domain-specialized tool designed to support utterance-level annotation of educational dialogue. While general-purpose text annotation tools for NLP and qualitative research abound, few address the complexities of coding education dialogue transcripts, which contain diverse teacher-student and peer interactions. Common challenges include defining codebooks for complex pedagogical features, supporting both open-ended and categorical coding, and contextualizing utterances with external features, such as the lesson's purpose and the pedagogical value of the instruction. EduCoder addresses these challenges by providing a platform for researchers and domain experts to collaboratively define complex codebooks based on observed data. It incorporates both categorical and open-ended annotation types along with contextual materials. Additionally, it offers a side-by-side comparison of multiple annotators' responses, so that annotators can calibrate their annotations against one another and improve data reliability. The system is open-source, with a demo video available.
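To make the abstract's ideas concrete, the following is a minimal Python sketch of a codebook entry that supports both categorical and open-ended codes, together with a side-by-side comparison and an agreement score for two annotators. It is an illustration under stated assumptions, not EduCoder's actual data model or API: the names `Code`, `side_by_side`, and `cohen_kappa`, the example code name "uptake", and the choice of Cohen's kappa as the reliability measure are all hypothetical and do not come from the abstract.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Code:
    """One codebook entry (hypothetical schema, not EduCoder's)."""
    name: str
    kind: str            # "categorical" or "open_ended"
    options: tuple = ()  # allowed labels; left empty for open-ended codes

    def validate(self, value: str) -> bool:
        if self.kind == "categorical":
            return value in self.options
        return bool(value.strip())  # open-ended: any non-empty text


def side_by_side(utterances, labels_a, labels_b):
    """Rows of (utterance, label A, label B, agree?) for calibration review."""
    return [(u, a, b, a == b) for u, a, b in zip(utterances, labels_a, labels_b)]


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators' categorical labels on the same utterances."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both annotators pick the same label at random.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Illustrative data only; these utterances are invented for the example.
uptake = Code("uptake", "categorical", ("yes", "no"))
utterances = ["So what did you notice?", "Open your books.", "Say more about that.", "Sit down."]
annotator_a = ["yes", "yes", "no", "no"]
annotator_b = ["yes", "no", "no", "no"]

rows = side_by_side(utterances, annotator_a, annotator_b)
kappa = cohen_kappa(annotator_a, annotator_b)
```

Here the two annotators disagree on one of the four utterances. The rows where the final field is `False` are the ones a calibration session would focus on.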
Aug-12-2025
- Country:
- Asia > Middle East
- Jordan (0.04)
- Europe > United Kingdom
- England > Oxfordshire > Oxford (0.04)
- North America > United States
- California
- Santa Clara County > Palo Alto (0.04)
- Ventura County > Thousand Oaks (0.04)
- New Mexico > Santa Fe County
- Santa Fe (0.04)
- Pennsylvania > Allegheny County
- Pittsburgh (0.04)
- South America > Uruguay
- Genre:
- Instructional Material (0.93)
- Research Report (1.00)
- Industry:
- Education (1.00)
- Technology:
- Information Technology
- Artificial Intelligence
- Machine Learning > Neural Networks
- Deep Learning (0.68)
- Natural Language > Large Language Model (0.99)
- Communications (0.93)
- Data Science (1.00)
- Software (1.00)