CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis
