Translate Smart, not Hard: Cascaded Translation Systems with Quality-Aware Deferral
Farinhas, António, Guerreiro, Nuno M., Agrawal, Sweta, Rei, Ricardo, Martins, André F. T.
arXiv.org Artificial Intelligence
Larger models often outperform smaller ones but come with high computational costs. Cascading offers a potential solution. By default, it uses smaller models and defers only some instances to larger, more powerful models. However, designing effective deferral rules remains a challenge. In this paper, we propose a simple yet effective approach for machine translation, using existing quality estimation (QE) metrics as deferral rules. We show that QE-based deferral allows a cascaded system to match the performance of a larger model while invoking it for a small fraction (30% to 50%) of the examples, significantly reducing computational costs. We validate this approach through both automatic and human evaluation.
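The deferral idea described above can be illustrated with a minimal sketch. The abstract does not spell out the exact deferral rule, so this example assumes a simple budget-based one: translate everything with the small model, score each draft with a reference-free QE metric, and send the lowest-scoring fraction of inputs to the large model. The names `cascade_translate`, `small_translate`, `large_translate`, `qe_score`, and `defer_fraction` are hypothetical placeholders, not the authors' code or any library's API.

```python
from typing import Callable, List, Sequence


def cascade_translate(
    sources: Sequence[str],
    small_translate: Callable[[str], str],   # hypothetical: cheap model
    large_translate: Callable[[str], str],   # hypothetical: expensive model
    qe_score: Callable[[str, str], float],   # hypothetical: QE(source, hypothesis), higher is better
    defer_fraction: float = 0.3,             # assumed budget: share of inputs sent to the large model
) -> List[str]:
    """Translate with the small model and defer the lowest-QE drafts to the large one."""
    if not 0.0 <= defer_fraction <= 1.0:
        raise ValueError("defer_fraction must be in [0, 1]")

    # Step 1: the small model translates every input.
    drafts = [small_translate(src) for src in sources]

    # Step 2: a reference-free QE metric scores each (source, draft) pair.
    scores = [qe_score(src, hyp) for src, hyp in zip(sources, drafts)]

    # Step 3: pick the lowest-scoring drafts, up to the deferral budget.
    n_defer = round(defer_fraction * len(sources))
    by_score = sorted(range(len(sources)), key=lambda i: scores[i])
    deferred = set(by_score[:n_defer])

    # Step 4: only the deferred inputs are re-translated by the large model.
    return [
        large_translate(sources[i]) if i in deferred else drafts[i]
        for i in range(len(sources))
    ]
```

With `defer_fraction` between 0.3 and 0.5, the large model would run on 30% to 50% of the inputs, the range the abstract reports as sufficient to match the larger model's quality. A fixed score threshold is an equally plausible rule; the budget form is used here only because it makes the cost explicit.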
Feb-18-2025
- Country:
- Asia (0.69)
- Europe (1.00)
- North America > United States (0.94)
- Genre:
- Research Report > Promising Solution (0.34)
- Technology: