CodeJudge-Eval: Can Large Language Models be Good Judges in Code Understanding?
