AITopics | chemistry task

comprehensive benchmark on eight tasks

Neural Information Processing SystemsFeb-16-2026, 19:28:53 GMT

Large Language Models (LLMs) with strong abilities in natural language processing tasks have emerged and have been applied in various kinds of areas such as science, finance and software engineering.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.93)
North America > Dominican Republic (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Pakistan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.80)

Add feedback

What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks

Neural Information Processing SystemsDec-26-2025, 15:42:02 GMT

Large Language Models (LLMs) with strong abilities in natural language processing tasks have emerged and have been applied in various kinds of areas such as science, finance and software engineering. However, the capability of LLMs to advance the field of chemistry remains unclear. In this paper, rather than pursuing state-of-the-art performance, we aim to evaluate capabilities of LLMs in a wide range of tasks across the chemistry domain. We identify three key chemistry-related capabilities including understanding, reasoning and explaining to explore in LLMs and establish a benchmark containing eight chemistry tasks. Our analysis draws on widely recognized datasets facilitating a broad exploration of the capacities of LLMs within the context of practical chemistry.

chemistry task, comprehensive benchmark, language model, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

comprehensive benchmark on eight tasks

Neural Information Processing SystemsOct-9-2025, 05:58:40 GMT

Large Language Models (LLMs) with strong abilities in natural language processing tasks have emerged and have been applied in various kinds of areas such as science, finance and software engineering.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.93)
North America > Dominican Republic (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Pakistan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.80)

Add feedback

ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools

Li, Zhucong, Zhang, Bowei, Xiao, Jin, Zhou, Zhijian, Cao, Fenglei, Liang, Jiaqing, Qi, Yuan

arXiv.org Artificial IntelligenceJun-19-2025

Large Language Model (LLM)-based agents have demonstrated the ability to improve performance in chemistry-related tasks by selecting appropriate tools. However, their effectiveness remains limited by the inherent prediction errors of chemistry tools. In this paper, we take a step further by exploring how LLMbased agents can, in turn, be leveraged to reduce prediction errors of the tools. To this end, we propose ChemHAS (Chemical Hierarchical Agent Stacking), a simple yet effective method that enhances chemistry tools through optimizing agent-stacking structures from limited data. ChemHAS achieves state-of-the-art performance across four fundamental chemistry tasks, demonstrating that our method can effectively compensate for prediction errors of the tools. Furthermore, we identify and characterize four distinct agent-stacking behaviors, potentially improving interpretability and revealing new possibilities for AI agent applications in scientific research. Our code and dataset are publicly available at https: //anonymous.4open.science/r/ChemHAS-01E4/README.md.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2505.21569

Country:

North America (0.28)
Asia (0.28)

Genre:

Overview (0.67)
Research Report (0.66)

Industry: Health & Medicine > Therapeutic Area (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks

Neural Information Processing SystemsJan-19-2025, 20:45:26 GMT

Large Language Models (LLMs) with strong abilities in natural language processing tasks have emerged and have been applied in various kinds of areas such as science, finance and software engineering. However, the capability of LLMs to advance the field of chemistry remains unclear. In this paper, rather than pursuing state-of-the-art performance, we aim to evaluate capabilities of LLMs in a wide range of tasks across the chemistry domain. We identify three key chemistry-related capabilities including understanding, reasoning and explaining to explore in LLMs and establish a benchmark containing eight chemistry tasks. Our analysis draws on widely recognized datasets facilitating a broad exploration of the capacities of LLMs within the context of practical chemistry.

artificial intelligence, large language model, natural language, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset

Yu, Botao, Baker, Frazier N., Chen, Ziqi, Ning, Xia, Sun, Huan

arXiv.org Artificial IntelligenceFeb-17-2024

Chemistry plays a crucial role in many domains, such as drug discovery and material science. While large language models (LLMs) such as GPT-4 exhibit remarkable capabilities on natural language processing tasks, existing work shows their performance on chemistry tasks is discouragingly low. In this paper, however, we demonstrate that our developed LLMs can achieve very strong results on a comprehensive set of chemistry tasks, outperforming the most advanced GPT-4 across all the tasks by a substantial margin and approaching the SoTA task-specific models. The key to our success is a large-scale, comprehensive, high-quality dataset for instruction tuning named SMolInstruct. It contains 14 meticulously selected chemistry tasks and over three million high-quality samples, laying a solid foundation for training and evaluating LLMs for chemistry. Based on SMolInstruct, we fine-tune a set of open-source LLMs, among which, we find that Mistral serves as the best base model for chemistry tasks. We further conduct analysis on the impact of trainable parameters, providing insights for future research.

gpt-4, llm, molecule, (15 more...)

arXiv.org Artificial Intelligence

2402.09391

Country: North America > United States > Ohio (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.95)
Health & Medicine > Therapeutic Area > Immunology (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks

Guo, Taicheng, Guo, Kehan, Nan, Bozhao, Liang, Zhenwen, Guo, Zhichun, Chawla, Nitesh V., Wiest, Olaf, Zhang, Xiangliang

arXiv.org Artificial IntelligenceDec-27-2023

Large Language Models (LLMs) with strong abilities in natural language processing tasks have emerged and have been applied in various kinds of areas such as science, finance and software engineering. However, the capability of LLMs to advance the field of chemistry remains unclear. In this paper, rather than pursuing state-of-the-art performance, we aim to evaluate capabilities of LLMs in a wide range of tasks across the chemistry domain. We identify three key chemistry-related capabilities including understanding, reasoning and explaining to explore in LLMs and establish a benchmark containing eight chemistry tasks. Our analysis draws on widely recognized datasets facilitating a broad exploration of the capacities of LLMs within the context of practical chemistry. Five LLMs (GPT-4, GPT-3.5, Davinci-003, Llama and Galactica) are evaluated for each chemistry task in zero-shot and few-shot in-context learning settings with carefully selected demonstration examples and specially crafted prompts. Our investigation found that GPT-4 outperformed other models and LLMs exhibit different competitive levels in eight chemistry tasks. In addition to the key findings from the comprehensive benchmark analysis, our work provides insights into the limitation of current LLMs and the impact of in-context learning settings on LLMs' performance across various chemistry tasks. The code and datasets used in this study are available at https://github.com/ChemFoundationModels/ChemLLMBench.

gpt model, llm, molecule, (15 more...)

arXiv.org Artificial Intelligence

2305.18365

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Dominican Republic (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Pakistan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.46)

Industry:

Law (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Filters

Collaborating Authors

chemistry task

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

comprehensive benchmark on eight tasks

What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks

comprehensive benchmark on eight tasks

ChemHAS: Hierarchical Agent Stacking for Enhancing Chemistry Tools

What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks

LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset

What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks