Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus
Bentivogli, Luisa, Savoldi, Beatrice, Negri, Matteo, Di Gangi, Mattia Antonino, Cattoni, Roldano, Turchi, Marco
arXiv.org Artificial Intelligence
Translating from languages without productive grammatical gender, such as English, into gender-marked languages is a well-known difficulty for machines. The difficulty stems in part from the fact that the training data on which models are built typically reflect the asymmetries of natural languages, gender bias included. Fed exclusively with textual data, machine translation is intrinsically constrained by the fact that the input sentence does not always contain clues about the gender identity of the human entities referred to. But what happens with speech translation, where the input is an audio signal? Can audio provide additional information to reduce gender bias? We present the first thorough investigation of gender bias in speech translation, contributing: i) the release of a benchmark useful for future studies, and ii) a comparison of different technologies (cascade and end-to-end) on two language directions (English-Italian and English-French).
Jun-10-2020