Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages
Soulos, Paul, Rao, Sudha, Smith, Caitlin, Rosen, Eric, Celikyilmaz, Asli, McCoy, R. Thomas, Jiang, Yichen, Haley, Coleman, Fernandez, Roland, Palangi, Hamid, Gao, Jianfeng, Smolensky, Paul
arXiv.org Artificial Intelligence
The task of machine translation has seen major progress in recent times with the advent of large-scale Transformer-based models (e.g., Vaswani et al., 2017; Dehghani et al., 2019; Liu et al., 2020a). However, there has been less progress on language pairs that specifically involve morphologically rich languages. Moreover, although there has been previous work that builds linguistic structure into translation models to deal with morphological complexity (Sennrich and Haddow, 2016; Dalvi et al., 2017; Matthews et al., 2018), to the best of our knowledge there has not been work that applies such strategies to large-scale Transformer-based models. We hypothesize that providing Transformers access to structured linguistic representations can significantly boost their performance on translation into languages with complex morphology that encodes linguistic structure. In this work, we investigate two methods for introducing such structural bias into Transformer-based models. In the first method, we use the TP-Transformer (TPT) (Schlag et al., 2019), in which a traditional Transformer is augmented with Tensor Product Representations (TPRs) (Smolensky, 1990) (§2).
Aug-11-2022