Does Typological Blinding Impede Cross-Lingual Sharing?
Bjerva, Johannes, Augenstein, Isabelle
arXiv.org Artificial Intelligence
Bridging the performance gap between high- and low-resource languages has been the focus of much previous work. Typological features from databases such as the World Atlas of Language Structures (WALS) are a prime candidate for this, as such data exists even for very low-resource languages. However, previous work has found only minor benefits from using typological information. Our hypothesis is that a model trained in a cross-lingual setting will pick up on typological cues from the input data, thus overshadowing the utility of explicitly using such features. We verify this hypothesis by blinding a model to typological information, and investigate how cross-lingual sharing and performance are impacted. Our model is based on a cross-lingual architecture in which the latent weights governing the sharing between languages are learnt during training. We show that (i) preventing this model from exploiting typology severely reduces performance, while a control experiment reaffirms that (ii) encouraging sharing according to typology somewhat improves performance.
Jan-28-2021
- Country:
- North America > United States
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Europe
- Czechia > Prague (0.04)
- Spain > Valencian Community
- Valencia Province > Valencia (0.04)
- Italy > Tuscany
- Florence (0.04)
- Germany > Saxony
- Leipzig (0.04)
- Finland > Uusimaa
- Helsinki (0.04)
- Denmark
- Capital Region > Copenhagen (0.04)
- North Jutland > Aalborg (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- Indonesia > Bali (0.04)
- Taiwan > Taiwan Province
- Taipei (0.04)
- Genre:
- Research Report > New Finding (0.46)
- Technology: