Eye of Judgement: Dissecting the Evaluation of Russian-speaking LLMs with POLLUX
