Benchmarking Cognitive Biases in Large Language Models as Evaluators