RefineBench: Evaluating Refinement Capability of Language Models via Checklists
