Part of Speech Tagging Bilingual Speech Transcripts with Intrasentential Model Switching

AAAI Conferences

This paper investigates incremental part-of-speech tagging for speech transcripts that contain multilingual intrasentential code-mixing. It compares the accuracy of a monolithic tagging model trained on a heterogeneous-language dataset with that of a model that switches dynamically between two homogeneous-language tagging models using word-by-word language identification. We find that the dynamic model, even though it is presented with a smaller context consisting of sentence fragments, matches the accuracy of the monolithic code-mixing model, which has access to a larger context. Our system is modular and is designed to be extended to code-mixing among many languages.
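
The model-switching idea can be illustrated with a minimal sketch. This is not the paper's implementation: the lexicons, the `identify_language` heuristic, the English/Spanish language pair, and the tag labels are all hypothetical stand-ins for trained language-identification and tagging models, and the sketch tags whole fragments at once rather than incrementally. It shows only the control flow: label each word with a language, group consecutive same-language words into fragments, and pass each fragment to the tagger for that language.

```python
from itertools import groupby
from typing import Callable, Dict, List, Tuple

Tagger = Callable[[List[str]], List[str]]

# Hypothetical toy lexicons; a real system would use trained tagging models.
EN_LEXICON = {"i": "PRON", "want": "VERB", "to": "PART", "eat": "VERB"}
ES_LEXICON = {"una": "DET", "manzana": "NOUN", "roja": "ADJ"}


def identify_language(word: str) -> str:
    """Toy word-level language ID: Spanish if in the Spanish lexicon, else English."""
    return "es" if word.lower() in ES_LEXICON else "en"


def make_lexicon_tagger(lexicon: Dict[str, str]) -> Tagger:
    """Build a monolingual tagger that looks words up, tagging unknown words 'X'."""
    def tag(words: List[str]) -> List[str]:
        return [lexicon.get(w.lower(), "X") for w in words]
    return tag


# One tagger per language; adding a language means adding one entry here.
TAGGERS: Dict[str, Tagger] = {
    "en": make_lexicon_tagger(EN_LEXICON),
    "es": make_lexicon_tagger(ES_LEXICON),
}


def tag_code_mixed(words: List[str]) -> List[Tuple[str, str, str]]:
    """Tag a code-mixed sentence by switching taggers at language boundaries.

    Returns (word, language, tag) triples in the original word order.
    """
    tagged: List[Tuple[str, str, str]] = []
    # groupby yields runs of consecutive words that share a language label,
    # so each tagger only ever sees a same-language sentence fragment.
    for lang, run in groupby(words, key=identify_language):
        fragment = list(run)
        tags = TAGGERS[lang](fragment)
        tagged.extend((w, lang, t) for w, t in zip(fragment, tags))
    return tagged


if __name__ == "__main__":
    for triple in tag_code_mixed("I want to eat una manzana roja".split()):
        print(triple)
```

Because each tagger receives only its own fragment, it never sees words from the other language, which corresponds to the smaller context the abstract mentions. Keeping the taggers in a dictionary keyed by language is one way to reflect the modular design, under the assumption that the language identifier can emit more than two labels.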