No Translation Needed: Forecasting Quality from Fertility and Metadata
Lundin, Jessica M., Zhang, Ada, Adelani, David, Carroll, Cody
arXiv.org Artificial Intelligence
We show that translation quality can be predicted with surprising accuracy \textit{without ever running the translation system itself}. Using only a handful of features (token fertility ratios, token counts, and basic linguistic metadata: language family, script, and region), we forecast ChrF scores for GPT-4o translations across 203 languages in the FLORES-200 benchmark. Gradient boosting models achieve favorable performance ($R^{2}=0.66$ for XX$\rightarrow$English and $R^{2}=0.72$ for English$\rightarrow$XX). Feature importance analyses reveal that typological factors dominate predictions for translation into English, while fertility plays a larger role for translation from English into diverse target languages. These findings suggest that translation quality is shaped by both token-level fertility and broader linguistic typology, offering new insights for multilingual evaluation and quality estimation.
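As a rough illustration of the kind of features the abstract describes, the sketch below computes token fertility (average subword tokens per word) and assembles one feature row per language pair. It is a minimal sketch under stated assumptions, not the authors' pipeline: `toy_tokenize` is a hypothetical stand-in for a real subword tokenizer, words are split on whitespace (which does not suit unsegmented scripts), the "fertility ratio" is assumed here to mean source fertility divided by target fertility, and the metadata values are placeholders.

```python
def fertility(sentences, tokenize):
    """Average number of tokens produced per whitespace-delimited word."""
    n_tokens = sum(len(tokenize(s)) for s in sentences)
    n_words = sum(len(s.split()) for s in sentences)
    return n_tokens / n_words if n_words else 0.0


def toy_tokenize(text, piece_len=4):
    """Hypothetical tokenizer: cuts each word into chunks of at most piece_len chars."""
    return [
        word[i:i + piece_len]
        for word in text.split()
        for i in range(0, len(word), piece_len)
    ]


def feature_row(src_sentences, tgt_sentences, metadata, tokenize=toy_tokenize):
    """Build one feature dict from fertility, token counts, and metadata."""
    src_f = fertility(src_sentences, tokenize)
    tgt_f = fertility(tgt_sentences, tokenize)
    return {
        "src_fertility": src_f,
        "tgt_fertility": tgt_f,
        # Assumed definition: ratio of source to target fertility.
        "fertility_ratio": src_f / tgt_f if tgt_f else 0.0,
        "src_tokens": sum(len(tokenize(s)) for s in src_sentences),
        "tgt_tokens": sum(len(tokenize(s)) for s in tgt_sentences),
        **metadata,  # e.g. language family, script, region (categorical)
    }


# Placeholder inputs, for illustration only.
row = feature_row(
    ["abcdefgh ab"],
    ["abcd ab"],
    {"family": "family-A", "script": "script-A", "region": "region-A"},
)
print(row)
```

Rows like this, with the categorical metadata encoded numerically, could then be passed to a gradient boosting regressor (for example scikit-learn's `GradientBoostingRegressor`) with measured ChrF scores as the regression target.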
Sep-9-2025
- Technology: