Enhancing Human Evaluation in Machine Translation with Comparative Judgment