Evaluating and Mitigating LLM-as-a-judge Bias in Communication Systems