Evaluating Byte and Wordpiece Level Models for Massively Multilingual Semantic Parsing
Nicosia, Massimo, Piccinno, Francesco
arXiv.org Artificial Intelligence
Token-free approaches have been successfully applied to a series of word- and span-level tasks. In this work, we compare a byte-level (ByT5) and a wordpiece-based (mT5) sequence-to-sequence model on the 51 languages of the MASSIVE multilingual semantic parsing dataset. We examine multiple experimental settings: (i) zero-shot, (ii) full gold data, and (iii) zero-shot with synthetic data. By leveraging a state-of-the-art label projection method for machine-translated examples, we are able to reduce the gap in exact match accuracy to only 5 points with respect to a model trained on gold data from all the languages. We additionally provide insights on the cross-lingual transfer of ByT5 and show how the model compares with mT5 across all parameter sizes.
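As a rough illustration of two notions used in the abstract, the sketch below shows one plausible way to compute exact match accuracy over predicted parses, and what a byte-level view of an input string looks like. It is not the authors' evaluation code: the function names, the whitespace stripping, and the bracketed parse strings are assumptions made for this example, and the abstract does not specify how outputs are normalized before comparison.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions identical to their reference parse.

    Assumption: surrounding whitespace is stripped before comparing;
    the abstract does not say which normalization the authors apply.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return matches / len(references)


def utf8_byte_ids(text):
    """Byte-level view of a string: one integer in 0..255 per UTF-8 byte."""
    return list(text.encode("utf-8"))


# Hypothetical parses, for illustration only (not taken from MASSIVE).
refs = ["[IN:alarm_set [SL:time 7 am ] ]", "[IN:play_music [SL:artist queen ] ]"]
preds = ["[IN:alarm_set [SL:time 7 am ] ]", "[IN:play_music [SL:song queen ] ]"]

print(exact_match_accuracy(preds, refs))  # one of the two parses matches exactly

# A non-ASCII character occupies more than one byte, so a byte-level
# model sees a longer sequence than the character count suggests.
word = "Z\u00fcrich"
print(len(word), len(utf8_byte_ids(word)))
```

Because exact match counts a parse as correct only when the whole output string agrees with the reference, a single wrong slot label makes the entire example count as an error.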
Dec-14-2022
- Genre:
- Research Report (0.50)
- Technology: