Eye of Judgement: Dissecting the Evaluation of Russian-speaking LLMs with POLLUX