Evaluate Large Language Model as Critic