Are Large Reasoning Models Good Translation Evaluators? Analysis and Performance Boost