AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
