Text Rewriting Improves Semantic Role Labeling
Journal of Artificial Intelligence Research
Large-scale annotated corpora are a prerequisite to developing high-performance NLP systems. Such corpora are expensive to produce, limited in size, and often demand linguistic expertise. In this paper we use text rewriting as a means of increasing the amount of labeled data available for model training. Our method uses rewrite rules automatically extracted from comparable corpora and bitexts to generate multiple versions of sentences annotated with gold standard labels. We apply this idea to semantic role labeling and show that a model trained on rewritten data outperforms the state of the art on the CoNLL-2009 benchmark dataset.
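The idea of generating several labeled versions of one gold sentence can be illustrated with a minimal sketch. It is not the paper's implementation: the rule format (a phrase pair), the per-token label representation, and the function names `apply_rule` and `augment` are all assumptions made for illustration, and the label projection here is deliberately conservative (a rewrite is kept only when the replaced span carries a single role label).

```python
from typing import List, Optional, Tuple

# Hypothetical representation: one (word, role label) pair per token.
Token = Tuple[str, str]
# Hypothetical rule format: (phrase to replace, replacement phrase).
Rule = Tuple[List[str], List[str]]


def apply_rule(sentence: List[Token], lhs: List[str], rhs: List[str]) -> Optional[List[Token]]:
    """Rewrite the first occurrence of `lhs` with `rhs`, carrying the label over.

    Returns None when the rule does not match, or when the matched span
    carries more than one label (an assumption of this sketch, so that the
    gold label can be projected onto the new words unambiguously).
    """
    words = [word for word, _ in sentence]
    n = len(lhs)
    for i in range(len(words) - n + 1):
        if words[i:i + n] == lhs:
            labels = {label for _, label in sentence[i:i + n]}
            if len(labels) != 1:
                return None
            label = labels.pop()
            return sentence[:i] + [(word, label) for word in rhs] + sentence[i + n:]
    return None


def augment(sentence: List[Token], rules: List[Rule]) -> List[List[Token]]:
    """Return the original sentence plus one rewritten copy per applicable rule."""
    versions = [sentence]
    for lhs, rhs in rules:
        rewritten = apply_rule(sentence, lhs, rhs)
        if rewritten is not None:
            versions.append(rewritten)
    return versions
```

Each rewritten sentence keeps the gold labels of the original, so the augmented set can be added to the training data without further annotation. In this sketch, a rule whose match straddles two roles is simply skipped.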
Sep-19-2014
- Country:
  - North America
    - United States
      - Michigan > Washtenaw County
        - Ann Arbor (0.14)
      - Ohio > Franklin County
        - Columbus (0.04)
      - Massachusetts > Suffolk County
        - Boston (0.04)
      - Hawaii > Honolulu County
        - Honolulu (0.04)
      - New York
        - New York County > New York City (0.04)
        - Monroe County > Rochester (0.04)
      - Oregon > Multnomah County
        - Portland (0.04)
      - California > Los Angeles County
        - Los Angeles (0.15)
      - Georgia > Fulton County
        - Atlanta (0.04)
      - Washington > King County
        - Seattle (0.04)
      - Colorado > Boulder County
        - Boulder (0.04)
      - Pennsylvania > Allegheny County
        - Pittsburgh (0.04)
    - Mexico > Mexico City
      - Mexico City (0.04)
    - Canada
  - Europe
    - Czechia > Prague (0.04)
    - United Kingdom
      - Scotland
        - City of Edinburgh > Edinburgh (0.04)
        - City of Glasgow > Glasgow (0.04)
      - England > Cambridgeshire
        - Cambridge (0.04)
    - France > Occitanie
      - Haute-Garonne > Toulouse (0.04)
    - Denmark > Capital Region
      - Copenhagen (0.04)
    - Bulgaria > Sofia City Province
      - Sofia (0.04)
  - Asia
    - South Korea (0.04)
    - Singapore (0.04)
    - Japan > Hokkaidō
      - Hokkaidō Prefecture > Sapporo (0.04)
    - China > Beijing
      - Beijing (0.04)
- Genre:
  - Research Report > New Finding (0.46)
- Technology: