ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback
Wu, Qinzhuo, Liu, Wei, Luan, Jian, Wang, Bin
arXiv.org Artificial Intelligence
Recently, tool-augmented LLMs have gained increasing attention. Given an instruction, tool-augmented LLMs can interact with various external tools over multiple rounds and provide a final answer. However, previous LLMs were trained on overly detailed instructions that included API names or parameters, whereas real users would not explicitly mention these API details. This leads to a gap between trained LLMs and real-world scenarios. In addition, most prior work ignores whether the interaction process follows the instruction. To address these issues, we construct a training dataset called MGToolBench, which contains statement and category-level instructions to better reflect real-world scenarios. We also propose ToolPlanner, a two-stage reinforcement learning framework that utilizes path planning and two feedback mechanisms to enhance the LLM's task completion and instruction-following capabilities. Experimental results show that ToolPlanner significantly improves the Match Rate, Pass Rate, and Win Rate by 26.8%, 20.2%, and 5.6%, respectively, compared to the SOTA model. Human evaluation verifies that the multi-granularity instructions better align with users' usage habits. Our data and code will be released upon acceptance.
Nov-3-2024
- Country:
- North America > United States (0.68)
- Genre:
- Research Report > New Finding (0.66)
- Industry:
- Banking & Finance > Trading (0.68)
- Consumer Products & Services (0.67)
- Leisure & Entertainment (1.00)
- Technology: