Auto-Eval Judge: Towards a General Agentic Framework for Task Completion Evaluation
