GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning

Neural Information Processing Systems

Molecule property prediction has gained significant attention in recent years. The main bottleneck is label insufficiency caused by expensive lab experiments. To alleviate this issue and to better leverage textual knowledge for these tasks, this study investigates the feasibility of employing natural language instructions to accomplish molecule-related tasks in a zero-shot setting. We discover that existing molecule-text models perform poorly in this setting due to inadequate treatment of instructions and limited capacity for graphs. To overcome these issues, we propose GIMLET, which unifies language models for both graph and text data. By adopting generalized position embedding, our model is extended to encode both graph structures and instruction text without additional graph encoding modules.





MolecularGPT: Open Large Language Model (LLM) for Few-Shot Molecular Property Prediction

Liu, Yuyan, Ding, Sirui, Zhou, Sheng, Fan, Wenqi, Tan, Qiaoyu

arXiv.org Artificial Intelligence

Molecular property prediction (MPP) is a fundamental and crucial task in drug discovery. However, prior methods are limited by the requirement for a large number of labeled molecules and by their restricted ability to generalize to unseen and new tasks, both of which are essential for real-world applications. To address these challenges, we present MolecularGPT for few-shot MPP. From an instruction-tuning perspective, we fine-tune large language models (LLMs) on curated molecular instructions spanning over 1000 property prediction tasks. This enables building a versatile and specialized LLM that can be adapted to novel MPP tasks without any fine-tuning, through zero- and few-shot in-context learning (ICL). MolecularGPT exhibits competitive in-context reasoning capabilities across 10 downstream evaluation datasets, setting new benchmarks for few-shot molecular prediction tasks. More importantly, with just two-shot examples, MolecularGPT can outperform standard supervised graph neural network methods on 4 out of 7 datasets. It also surpasses state-of-the-art LLM baselines by up to a 16.6% increase in classification accuracy and a decrease of 199.17 on regression metrics (e.g., RMSE) in the zero-shot setting. This study demonstrates the potential of LLMs as effective few-shot molecular property predictors. The code is available at https://github.com/NYUSHCS/MolecularGPT.
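The k-shot ICL setup the abstract describes can be illustrated with a small prompt-building sketch. This is a hypothetical illustration of the general pattern (instruction, labeled SMILES examples, then the query), not MolecularGPT's actual prompt template; the instruction wording, SMILES strings, and labels are invented for the example.

```python
# Hypothetical sketch of few-shot in-context prompting for molecular
# property prediction. The template, SMILES examples, and labels are
# illustrative assumptions, not taken from the MolecularGPT paper.

def build_fewshot_prompt(instruction, examples, query_smiles):
    """Assemble a k-shot prompt: instruction, labeled examples, then the query."""
    lines = [instruction]
    for smiles, label in examples:  # each in-context example is (SMILES, label)
        lines.append(f"SMILES: {smiles}\nAnswer: {label}")
    # The query molecule ends with an open "Answer:" for the LLM to complete.
    lines.append(f"SMILES: {query_smiles}\nAnswer:")
    return "\n\n".join(lines)

prompt = build_fewshot_prompt(
    "Predict whether the molecule is toxic (Yes/No).",
    [("CCO", "No"), ("c1ccccc1N", "Yes")],  # a two-shot context
    "CC(=O)Oc1ccccc1C(=O)O",
)
print(prompt)
```

In the zero-shot case the example list is simply empty, leaving only the instruction and the query.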


The Lifelike Illusions of A.I.

The New Yorker

In January, 1999, the Washington Post reported that the National Security Agency had issued a memo on its intranet with the subject "Furby Alert." According to the Post, the memo decreed that employees were prohibited from bringing to work any recording devices, including "toys, such as 'Furbys,' with built-in recorders that repeat the audio with synthesized sound." That holiday season, the Furby, an animatronic toy resembling a small owl, had been a retail sensation; nearly two million were sold by year's end. They were now banned from N.S.A. headquarters. A worry, according to one source for the Post, was that the toy might "start talking classified." Tiger Electronics, the makers of the Furby, was perplexed.


GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning

Zhao, Haiteng, Liu, Shengchao, Ma, Chang, Xu, Hannan, Fu, Jie, Deng, Zhi-Hong, Kong, Lingpeng, Liu, Qi

arXiv.org Artificial Intelligence

Molecule property prediction has gained significant attention in recent years. The main bottleneck is label insufficiency caused by expensive lab experiments. To alleviate this issue and to better leverage textual knowledge for these tasks, this study investigates the feasibility of employing natural language instructions to accomplish molecule-related tasks in a zero-shot setting. We discover that existing molecule-text models perform poorly in this setting due to inadequate treatment of instructions and limited capacity for graphs. To overcome these issues, we propose GIMLET, which unifies language models for both graph and text data. By adopting generalized position embedding, our model is extended to encode both graph structures and instruction text without additional graph encoding modules. GIMLET also decouples the encoding of the graph from the task instructions in the attention mechanism, enhancing the generalization of graph features across novel tasks. We construct a dataset consisting of more than two thousand molecule tasks with corresponding instructions derived from task descriptions. We pretrain GIMLET on the molecule tasks along with instructions, enabling the model to transfer effectively to a broad range of tasks. Experimental results demonstrate that GIMLET significantly outperforms molecule-text baselines in instruction-based zero-shot learning, even achieving results close to supervised GNN models on tasks such as ToxCast and MUV.
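The two mechanisms the abstract names, a generalized position embedding over graph and text and the decoupling of graph encoding from instruction text in attention, can be sketched as an additive attention-bias matrix over a joint sequence of graph nodes and instruction tokens. This is a minimal illustrative sketch of the idea under stated assumptions (graph-to-graph "position" taken as shortest-path distance, decoupling realized as masking graph-to-text attention); the function name and all details are hypothetical, not GIMLET's actual implementation.

```python
# Hypothetical sketch: one attention-bias matrix for a joint sequence of
# [graph nodes | instruction tokens]. Assumptions (not from the paper):
# graph-graph bias uses shortest-path distance, text-text uses relative
# position, and decoupling = graph nodes cannot attend to text tokens.
import numpy as np

def attention_bias(n_nodes, n_text, sp_dist):
    """Build an additive attention bias for n_nodes graph nodes + n_text tokens.

    sp_dist: (n_nodes, n_nodes) array of shortest-path distances.
    """
    n = n_nodes + n_text
    bias = np.zeros((n, n))
    # Graph-graph: generalized "position" = shortest-path distance
    # (closer node pairs get a larger, i.e. less negative, bias).
    bias[:n_nodes, :n_nodes] = -sp_dist
    # Decoupling: graph nodes do not attend to instruction tokens,
    # so graph features are computed independently of the task text.
    bias[:n_nodes, n_nodes:] = -np.inf
    # Text-text: ordinary relative-position bias; text-graph stays 0,
    # so instruction tokens can still read the graph representation.
    rel = np.arange(n_text)[:, None] - np.arange(n_text)[None, :]
    bias[n_nodes:, n_nodes:] = -np.abs(rel)
    return bias

# A 2-node graph (distance 1 apart) with a 3-token instruction.
b = attention_bias(2, 3, np.array([[0.0, 1.0], [1.0, 0.0]]))
```

Because the graph rows are masked against the text columns, the same pretrained graph encoding can be reused under unseen instructions, which is the generalization property the abstract claims.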