toolkengpt
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (3 more...)
- Workflow (0.67)
- Overview (0.46)
- Research Report (0.46)
- Leisure & Entertainment (0.93)
- Media > Film (0.46)
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
Integrating large language models (LLMs) with various tools has led to increased attention in the field. Existing approaches either involve fine-tuning the LLM, which is both computationally costly and limited to a fixed set of tools, or prompting LLMs by in-context tool demonstrations. Although the latter method offers adaptability to new tools, it struggles with the inherent context length constraint of LLMs when many new tools are presented, and mastering a new set of tools with few-shot examples remains challenging, resulting in suboptimal performance.
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (3 more...)
- Workflow (0.67)
- Overview (0.46)
- Research Report (0.46)
- Leisure & Entertainment (0.93)
- Media > Film (0.46)
Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models
Wu, Mengsong, Zhu, Tong, Han, Han, Zhang, Xiang, Shao, Wenbiao, Chen, Wenliang
Tool learning can further broaden the usage scenarios of large language models (LLMs). However most of the existing methods either need to finetune that the model can only use tools seen in the training data, or add tool demonstrations into the prompt with lower efficiency. In this paper, we present a new Tool Learning method Chain-of-Tools. It makes full use of the powerful semantic representation capability of frozen LLMs to finish tool calling in CoT reasoning with a huge and flexible tool pool which may contain unseen tools. Especially, to validate the effectiveness of our approach in the massive unseen tool scenario, we construct a new dataset SimpleToolQuestions. We conduct experiments on two numerical reasoning benchmarks (GSM8K-XL and FuncQA) and two knowledge-based question answering benchmarks (KAMEL and SimpleToolQuestions). Experimental results show that our approach performs better than the baseline. We also identify dimensions of the model output that are critical in tool selection, enhancing the model interpretability. Our code and data are available at: https://github.com/fairyshine/Chain-of-Tools .
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
Integrating large language models (LLMs) with various tools has led to increased attention in the field. Existing approaches either involve fine-tuning the LLM, which is both computationally costly and limited to a fixed set of tools, or prompting LLMs by in-context tool demonstrations. Although the latter method offers adaptability to new tools, it struggles with the inherent context length constraint of LLMs when many new tools are presented, and mastering a new set of tools with few-shot examples remains challenging, resulting in suboptimal performance. To address these limitations, we propose a novel solution, named ToolkenGPT, wherein LLMs effectively learn to master tools as predicting tokens through tool embeddings for solving complex tasks. In this framework, each tool is transformed into vector embeddings and plugged into the language model head.
Meta-Reasoning Improves Tool Use in Large Language Models
External tools help large language models (LLMs) succeed at tasks where they would otherwise typically fail. In existing frameworks, LLMs learn tool use either by in-context demonstrations or via full model fine-tuning on annotated data. As these approaches do not easily scale, a recent trend is to abandon them in favor of lightweight, parameter-efficient tuning paradigms. These methods allow quickly alternating between the frozen LLM and its specialised fine-tuned version, by switching on or off a handful of additional custom parameters. Hence, we postulate that the generalization ability of the frozen model can be leveraged to improve tool selection. We present Tool selECTion via meta-reasONing (TECTON), a two-phase system that first reasons over a task using a custom fine-tuned LM head and outputs candidate tools. Then, with the custom head disabled, it meta-reasons (i.e., it reasons over the previous reasoning process) to make a final choice. We show that TECTON results in substantial gains - both in-distribution and out-of-distribution - on a range of math reasoning datasets.
- North America > United States > California (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Europe > Germany > Berlin (0.04)
- (3 more...)
Toolken+: Improving LLM Tool Usage with Reranking and a Reject Option
Yakovlev, Konstantin, Nikolenko, Sergey, Bout, Andrey
The recently proposed ToolkenGPT tool learning paradigm demonstrates promising performance but suffers from two major issues: first, it cannot benefit from tool documentation, and second, it often makes mistakes in whether to use a tool at all. We introduce Toolken+ that mitigates the first problem by reranking top $k$ tools selected by ToolkenGPT and the second problem with a special "Reject" option such that the model will generate a vocabulary token if "Reject" is ranked first. We demonstrate the effectiveness of Toolken+ on multistep numerical reasoning and tool selection tasks.
- Asia > Russia (0.14)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
- North America > Canada > Ontario > Toronto (0.04)
- (2 more...)
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
Hao, Shibo, Liu, Tianyang, Wang, Zhen, Hu, Zhiting
Augmenting large language models (LLMs) with external tools has emerged as a promising approach to solving complex problems. However, traditional methods, which finetune LLMs with tool demonstration data, can be both costly and restricted to a predefined set of tools. Recent in-context learning paradigm alleviates these issues, but the limited context length only allows for a few shots of demonstrations, leading to suboptimal understandings of the tools. Moreover, when there are numerous tools to choose from, in-context learning could completely fail to work. In this paper, we propose an alternative approach, $\textbf{ToolkenGPT}$, which combines the benefits of both sides. Our approach represents each $\underline{tool}$ as a to$\underline{ken}$ ($\textit{toolken}$) and learns an embedding for it, enabling tool calls in the same way as generating a regular word token. Once a toolken is triggered, the LLM is prompted to complete arguments for the tool to execute. ToolkenGPT offers the flexibility to plug in an arbitrary number of tools by expanding the set of toolkens on the fly. In addition, it improves tool use by allowing extensive demonstration data for learning the toolken embeddings. In diverse domains, including numerical reasoning, knowledge-based question answering, and embodied plan generation, our approach effectively augments LLMs with tools and substantially outperforms various latest baselines. ToolkenGPT demonstrates the promising ability to use relevant tools from a large tool set in complex scenarios.
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > New York (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (4 more...)
- Leisure & Entertainment (1.00)
- Media > Film (0.68)