Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving
Botao Yu, Frazier N. Baker, Ziru Chen, Garrett Herb, Boyu Gou, Daniel Adu-Ampratwum, Xia Ning, Huan Sun
arXiv.org Artificial Intelligence
To enhance large language models (LLMs) for chemistry problem solving, several tool-augmented LLM-based agents have been proposed, such as ChemCrow and Coscientist. However, their evaluations are narrow in scope, leaving a large gap in understanding the benefits of tools across diverse chemistry tasks. To bridge this gap, we develop ChemAgent, an enhanced chemistry agent that improves on ChemCrow, and conduct a comprehensive evaluation of its performance on both specialized chemistry tasks and general chemistry questions. Surprisingly, ChemAgent does not consistently outperform its tool-free base LLMs. Our error analysis with a chemistry expert suggests that, for specialized chemistry tasks such as synthesis prediction, agents should be augmented with specialized tools; for general chemistry questions like those in exams, however, agents' ability to reason correctly with chemistry knowledge matters more, and tool augmentation does not always help.
Nov-11-2024
- Country:
- North America > United States (0.28)
- Genre:
- Research Report > New Finding (0.68)
- Industry:
- Health & Medicine
- Pharmaceuticals & Biotechnology (1.00)
- Therapeutic Area (1.00)
- Materials > Chemicals (0.93)
- Technology: