Shen, Zhuocheng
LLM With Tools: A Survey
Shen, Zhuocheng
The integration of tools in augmenting large language models presents a novel approach toward enhancing the efficiency and accuracy of these models in handling specific, complex tasks. This paper delves into the methodology,challenges, and developments in the realm of teaching LLMs to use external tools, thereby pushing the boundaries of their capabilities beyond pre-existing knowledge bases. We introduce a standardized paradigm for tool integration guided by a series of functions that map user instructions to actionable plans and their execution, emphasizing the significance of understanding user intent, tool selection, and dynamic plan adjustment. Our exploration reveals the various challenges encountered, such as tool invocation timing, selection accuracy, and the need for robust reasoning processes. In addressing these challenges, we investigate techniques within the context of fine-tuning and incontext learning paradigms, highlighting innovative approaches to ensure diversity, augment datasets, and improve generalization.Furthermore, we investigate a perspective on enabling LLMs to not only utilize but also autonomously create tools, which may redefine their role from mere tool users to tool creators. Finally,we reproduced Chameleon's results on ScienceQA and analyzed the code structure.
LLMs-based Few-Shot Disease Predictions using EHR: A Novel Approach Combining Predictive Agent Reasoning and Critical Agent Instruction
Cui, Hejie, Shen, Zhuocheng, Zhang, Jieyu, Shao, Hui, Qin, Lianhui, Ho, Joyce C., Yang, Carl
Electronic health records (EHRs) contain valuable patient data for health-related prediction tasks, such as disease prediction. Traditional approaches rely on supervised learning methods that require large labeled datasets, which can be expensive and challenging to obtain. In this study, we investigate the feasibility of applying Large Language Models (LLMs) to convert structured patient visit data (e.g., diagnoses, labs, prescriptions) into natural language narratives. We evaluate the zero-shot and few-shot performance of LLMs using various EHR-prediction-oriented prompting strategies. Furthermore, we propose a novel approach that utilizes LLM agents with different roles: a predictor agent that makes predictions and generates reasoning processes and a critic agent that analyzes incorrect predictions and provides guidance for improving the reasoning of the predictor agent. Our results demonstrate that with the proposed approach, LLMs can achieve decent few-shot performance compared to traditional supervised learning methods in EHR-based disease predictions, suggesting its potential for health-oriented applications. Introduction Large Language Models (LLMs) have emerged as a powerful tool in various domains, including healthcare.