Butter-Bench: Evaluating LLM Controlled Robots for Practical Intelligence
