On scalable oversight with weak LLMs judging strong LLMs
