Translate Smart, not Hard: Cascaded Translation Systems with Quality-Aware Deferral
Farinhas, António, Guerreiro, Nuno M., Agrawal, Sweta, Rei, Ricardo, Martins, André F. T.
arXiv.org Artificial Intelligence
Larger models often outperform smaller ones but come with high computational costs. Cascading offers a potential solution. By default, it uses smaller models and defers only some instances to larger, more powerful models. However, designing effective deferral rules remains a challenge. In this paper, we propose a simple yet effective approach for machine translation, using existing quality estimation (QE) metrics as deferral rules. We show that QE-based deferral allows a cascaded system to match the performance of a larger model while invoking it for a small fraction (30% to 50%) of the examples, significantly reducing computational costs. We validate this approach through both automatic and human evaluation.
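The deferral idea described in the abstract can be illustrated with a minimal sketch. It assumes a fixed deferral budget: every source is translated by the small model, each draft is scored by a reference-free QE metric, and the lowest-scoring fraction is re-translated by the large model. The names `cascade_translate`, `small_model`, `large_model`, `qe_score`, and `defer_fraction` are hypothetical placeholders, and the budget-based rule is one plausible reading of the abstract, not necessarily the exact rule used in the paper.

```python
from typing import Callable, List, Sequence


def cascade_translate(
    sources: Sequence[str],
    small_model: Callable[[str], str],      # hypothetical: cheap translator
    large_model: Callable[[str], str],      # hypothetical: expensive translator
    qe_score: Callable[[str, str], float],  # hypothetical: (source, draft) -> quality, higher is better
    defer_fraction: float = 0.3,            # assumed budget: share of inputs sent to the large model
) -> List[str]:
    """Translate with the small model and defer the lowest-QE drafts to the large model."""
    if not 0.0 <= defer_fraction <= 1.0:
        raise ValueError("defer_fraction must be in [0, 1]")

    # Step 1: the small model translates everything.
    drafts = [small_model(src) for src in sources]

    # Step 2: score each draft without a reference translation.
    scores = [qe_score(src, hyp) for src, hyp in zip(sources, drafts)]

    # Step 3: pick the indices with the lowest estimated quality, up to the budget.
    n_defer = int(defer_fraction * len(sources))
    ranked = sorted(range(len(sources)), key=lambda i: scores[i])
    deferred = set(ranked[:n_defer])

    # Step 4: only the deferred inputs are re-translated by the large model.
    return [
        large_model(src) if i in deferred else drafts[i]
        for i, src in enumerate(sources)
    ]
```

With `defer_fraction` between 0.3 and 0.5, the large model is invoked on 30% to 50% of the inputs, which is the operating range the abstract reports. A threshold on the QE score could replace the fixed budget; that variant is also an assumption here.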
Feb-18-2025
- Country:
- Asia
- Middle East > UAE
- Abu Dhabi Emirate > Abu Dhabi (0.05)
- Singapore (0.05)
- Thailand > Bangkok
- Bangkok (0.04)
- Europe
- North America > United States
- Florida > Miami-Dade County
- Miami (0.04)
- New Mexico (0.04)
- New York > New York County
- New York City (0.04)
- Washington > King County
- Seattle (0.04)
- Genre:
- Research Report > Promising Solution (0.34)
- Technology: