MetaMetrics-MT: Tuning Meta-Metrics for Machine Translation via Human Preference Calibration

Anugraha, David, Kuwanto, Garry, Susanto, Lucky, Wijaya, Derry Tanti, Winata, Genta Indra

arXiv.org Artificial Intelligence 

We present MetaMetrics-MT, an innovative metric designed to evaluate machine translation (MT) tasks by aligning closely with human preferences through Bayesian optimization with Gaussian Processes. MetaMetrics-MT enhances existing MT metrics by optimizing their correlation with human judgments. Our experiments on the WMT24 metric shared task dataset demonstrate that MetaMetrics-MT outperforms all existing baselines, setting a new benchmark for state-of-the-art performance in the reference-based setting. Furthermore, it achieves comparable results to leading metrics in the reference-free setting, offering greater efficiency.