TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
