AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios

Jun-23-2026, 07:25:34 GMT–Neural Information Processing Systems

Large Language Models (LLMs) have demonstrated advanced capabilities in real-world agentic applications. Growing research efforts aim to develop LLM-based agents to address practical demands, which introduces a new challenge: agentic scenarios often involve lengthy instructions with complex constraints, such as extended system prompts and detailed tool specifications. While adherence to such instructions is crucial for agentic applications, whether LLMs can reliably follow them remains underexplored. In this paper, we introduce AGENTIF, the first benchmark for systematically evaluating the instruction-following ability of LLMs in agentic scenarios. AGENTIF features three key characteristics: (1) Realistic, constructed from 50 real-world agentic applications.

constraint, large language model, machine learning

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Information Technology (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.70)
