Are UD Treebanks Getting More Consistent? A Report Card for English UD

Amir Zeldes, Nathan Schneider

arXiv.org Artificial Intelligence 

... encompass not only over 100 languages, but also over 200 treebanks, meaning several languages now have multiple treebanks with rich morphosyntactic and other annotations. Multiple treebanks are especially common for high-resource languages such as English, which currently has data in 9 different repositories, totaling over 762,000 tokens (as of UD v2.11). While this abundance of resources is of course positive, it opens questions about consistency across multiple UD treebanks of the same ...

We therefore consider it timely to ask whether even the largest, most actively developed UD treebanks for English are actually compatible; if not, to what extent, and are they inching closer together or drifting apart from version to version? Regardless of the answer to these questions, is it a good idea to train jointly on EWT and GUM, and if so, given constant revisions to the data, since what UD version?
