AITopics | molformer

Collaborating Authors

molformer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

848350afc07c02edea5be06d3390d865-Paper-Conference.pdf

Neural Information Processing SystemsFeb-16-2026, 07:02:46 GMT

descriptor, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Add feedback

Superior Molecular Representations from Intermediate Encoder Layers

Pinto, Luis

arXiv.org Artificial IntelligenceOct-16-2025

Pretrained molecular encoders have become indispensable in computational chemistry for tasks such as property prediction and molecular generation. However, the standard practice of relying solely on final-layer embeddings for downstream tasks may discard valuable information. In this work, we first analyze the information flow in five diverse molecular encoders and find that intermediate layers retain more general-purpose features, whereas the final-layer specializes and compresses information. We then perform an empirical layer-wise evaluation across 22 property prediction tasks. We find that using frozen embeddings from optimal intermediate layers improves downstream performance by an average of 5.4%, up to 28.6%, compared to the final-layer. Furthermore, finetuning encoders truncated at intermediate depths achieves even greater average improvements of 8.5%, with increases as high as 40.8%, obtaining new state-of-the-art results on several benchmarks. These findings highlight the importance of exploring the full representational depth of molecular encoders to achieve substantial performance improvements and computational efficiency. The code will be made publicly available.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.06443

Country:

North America > United States > Minnesota (0.28)
Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

Add feedback

Can Transformers Smell Like Humans?

Neural Information Processing SystemsOct-11-2025, 00:28:48 GMT

In this work, we ask the question of whether pre-trained transformer models of chemical structures encode representations that are aligned with human olfactory perception, i.e., can transformers smell like humans?

dataset, descriptor, representation, (17 more...)

Neural Information Processing Systems

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Add feedback

Addressing Model Overcomplexity in Drug-Drug Interaction Prediction With Molecular Fingerprints

Gil-Sorribes, Manel, Molina, Alexis

arXiv.org Artificial IntelligenceMar-30-2025

Accurately predicting drug-drug interactions (DDIs) is crucial for pharmaceutical research and clinical safety. Recent deep learning models often suffer from high computational costs and limited generalization across datasets. In this study, we investigate a simpler yet effective approach using molecular representations such as Morgan fingerprints (MFPS), graph-based embeddings from graph convolutional networks (GCNs), and transformer-derived embeddings from MoLFormer integrated into a straightforward neural network. We benchmark our implementation on DrugBank DDI splits and a drug-drug affinity (DDA) dataset from the Food and Drug Administration. MFPS along with MoLFormer and GCN representations achieve competitive performance across tasks, even in the more challenging leak-proof split, highlighting the sufficiency of simple molecular representations. Moreover, we are able to identify key molecular motifs and structural patterns relevant to drug interactions via gradient-based analyses using the representations under study. Despite these results, dataset limitations such as insufficient chemical diversity, limited dataset size, and inconsistent labeling impact robust evaluation and challenge the need for more complex approaches. Our work provides a meaningful baseline and emphasizes the need for better dataset curation and progressive complexity scaling.

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Artificial Intelligence

2503.2355

Country:

North America > United States (0.49)
Europe > Spain (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Can Transformers Smell Like Humans?

Taleb, Farzaneh, Vasco, Miguel, Ribeiro, Antônio H., Björkman, Mårten, Kragic, Danica

arXiv.org Artificial IntelligenceNov-5-2024

Despite recent advances in understanding visual and auditory perception, olfactory perception remains an under-explored topic in the machine learning community due to the lack of large-scale datasets annotated with labels of human olfactory perception. In this work, we ask the question of whether pre-trained transformer models of chemical structures encode representations that are aligned with human olfactory perception, i.e., can transformers smell like humans? We demonstrate that representations encoded from transformers pre-trained on general chemical structures are highly aligned with human olfactory perception. We use multiple datasets and different types of perceptual representations to show that the representations encoded by transformer models are able to predict: (i) labels associated with odorants provided by experts; (ii) continuous ratings provided by human participants with respect to pre-defined descriptors; and (iii) similarity ratings between odorants provided by human participants. Finally, we evaluate the extent to which this alignment is associated with physicochemical features of odorants known to be relevant for olfactory decoding.

descriptor, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2411.03038

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Pre-trained Molecular Language Models with Random Functional Group Masking

Peng, Tianhao, Li, Yuchen, Li, Xuhong, Bian, Jiang, Xie, Zeke, Sui, Ning, Mumtaz, Shahid, Xu, Yanwu, Kong, Linghe, Xiong, Haoyi

arXiv.org Artificial IntelligenceNov-2-2024

Recent advancements in computational chemistry have leveraged the power of trans-former-based language models, such as MoLFormer, pre-trained using a vast amount of simplified molecular-input line-entry system (SMILES) sequences, to understand and predict molecular properties and activities, a critical step in fields like drug discovery and materials science. To further improve performance, researchers have introduced graph neural networks with graph-based molecular representations, such as GEM, incorporating the topology, geometry, 2D or even 3D structures of molecules into pre-training. While most of molecular graphs in existing studies were automatically converted from SMILES sequences, it is to assume that transformer-based language models might be able to implicitly learn structure-aware representations from SMILES sequences. In this paper, we propose \ours{} -- a SMILES-based \underline{\em M}olecular \underline{\em L}anguage \underline{\em M}odel, which randomly masking SMILES subsequences corresponding to specific molecular \underline{\em F}unctional \underline{\em G}roups to incorporate structure information of atoms during the pre-training phase. This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities. Extensive experimental evaluations across 11 benchmark classification and regression tasks in the chemical domain demonstrate the robustness and superiority of \ours{}. Our findings reveal that \ours{} outperforms existing pre-training models, either based on SMILES or graphs, in 9 out of the 11 downstream tasks, ranking as a close second in the remaining ones.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2411.01401

Country:

North America > United States (0.93)
Asia > China > Guangdong Province (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Leveraging Chemistry Foundation Models to Facilitate Structure Focused Retrieval Augmented Generation in Multi-Agent Workflows for Catalyst and Materials Design

Park, Nathaniel H., Callahan, Tiffany J., Hedrick, James L., Erdmann, Tim, Capponi, Sara

arXiv.org Artificial IntelligenceAug-21-2024

Molecular property prediction and generative design via deep learning models has been the subject of intense research given its potential to accelerate development of new, high-performance materials. More recently, these workflows have been significantly augmented with the advent of large language models (LLMs) and systems of LLM-driven agents capable of utilizing pre-trained models to make predictions in the context of more complex research tasks. While effective, there is still room for substantial improvement within the agentic systems on the retrieval of salient information for material design tasks. Moreover, alternative uses of predictive deep learning models, such as leveraging their latent representations to facilitate cross-modal retrieval augmented generation within agentic systems to enable task-specific materials design, has remained unexplored. Herein, we demonstrate that large, pre-trained chemistry foundation models can serve as a basis for enabling semantic chemistry information retrieval for both small-molecules, complex polymeric materials, and reactions. Additionally, we show the use of chemistry foundation models in conjunction with image models such as OpenCLIP facilitate unprecedented queries and information retrieval across multiple characterization data domains. Finally, we demonstrate the integration of these systems within multi-agent systems to facilitate structure and topological-based natural language queries and information retrieval for complex research tasks.

compound, query, similarity, (16 more...)

arXiv.org Artificial Intelligence

2408.11793

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Workflow (1.00)

Industry:

Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.46)
Materials > Chemicals > Specialty Chemicals (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

COEFF-KANs: A Paradigm to Address the Electrolyte Field with KANs

Li, Xinhe, Feng, Zhuoying, Chen, Yezeng, Dai, Weichen, He, Zixu, Zhou, Yi, Jiao, Shuhong

arXiv.org Artificial IntelligenceJul-24-2024

To reduce the experimental validation workload for chemical researchers and accelerate the design and optimization of high-energy-density lithium metal batteries, we aim to leverage models to automatically predict Coulombic Efficiency (CE) based on the composition of liquid electrolytes. There are mainly two representative paradigms in existing methods: machine learning and deep learning. However, the former requires intelligent input feature selection and reliable computational methods, leading to error propagation from feature estimation to model prediction, while the latter (e.g. MultiModal-MoLFormer) faces challenges of poor predictive performance and overfitting due to limited diversity in augmented data. To tackle these issues, we propose a novel method COEFF (COlumbic EFficiency prediction via Fine-tuned models), which consists of two stages: pre-training a chemical general model and fine-tuning on downstream domain data. Firstly, we adopt the publicly available MoLFormer model to obtain feature vectors for each solvent and salt in the electrolyte. Then, we perform a weighted average of embeddings for each token across all molecules, with weights determined by the respective electrolyte component ratios. Finally, we input the obtained electrolyte features into a Multi-layer Perceptron or Kolmogorov-Arnold Network to predict CE. Experimental results on a real-world dataset demonstrate that our method achieves SOTA for predicting CE compared to all baselines. Data and code used in this work will be made publicly available after the paper is published.

electrolyte, molformer, representation, (16 more...)

arXiv.org Artificial Intelligence

2407.20265

Country:

South America > Brazil (0.04)
Asia > China (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Large language models, physics-based modeling, experimental measurements: the trinity of data-scarce learning of polymer properties

Liu, Ning, Jafarzadeh, Siavash, Lattimer, Brian Y., Ni, Shuna, Lua, Jim, Yu, Yue

arXiv.org Artificial IntelligenceJul-2-2024

Their vast number of trainable parameters necessitates a wealth of data to achieve accuracy and mitigate overfitting. However, experimental measurements are often limited and costly to obtain in sufficient quantities for finetuning. To this end, we present a physics-based training pipeline that tackles the pathology of data scarcity. The core enabler is a physics-based modeling framework that generates a multitude of synthetic data to align the LLM to a physically consistent initial state before finetuning. Our framework features a two-phase training strategy: (1) utilizing the large-in-amount while less accurate synthetic data for supervised pretraining, and (2) finetuning the phase-1 model with limited experimental data. We empirically demonstrate that supervised pretraining is vital to obtaining accurate finetuned LLMs, via the lens of learning polymer flammability metrics where cone calorimeter data is sparse.

large language model, machine learning, polymer, (20 more...)

arXiv.org Artificial Intelligence

2407.0277

Country: North America > United States > Maryland (0.28)

Genre: Research Report (0.82)

Industry:

Materials > Chemicals (0.68)
Energy > Oil & Gas (0.68)
Health & Medicine (0.66)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Discovering and Explaining the Representation Bottleneck of Graph Neural Networks from Multi-order Interactions

Wu, Fang, Li, Siyuan, Wu, Lirong, Radev, Dragomir, Li, Stan Z.

arXiv.org Artificial IntelligenceJan-8-2023

Graph neural networks (GNNs) mainly rely on the message-passing paradigm to propagate node features and build interactions, and different graph learning tasks require different ranges of node interactions. In this work, we explore the capacity of GNNs to capture interactions between nodes under contexts with different complexities. We discover that GNNs are usually unable to capture the most informative kinds of interaction styles for diverse graph learning tasks, and thus name this phenomenon as GNNs' representation bottleneck. As a response, we demonstrate that the inductive bias introduced by existing graph construction mechanisms can prevent GNNs from learning interactions of the most appropriate complexity, i.e., resulting in the representation bottleneck. To address that limitation, we propose a novel graph rewiring approach based on interaction patterns learned by GNNs to adjust the receptive fields of each node dynamically. Extensive experiments on both real-world and synthetic datasets prove the effectiveness of our algorithm to alleviate the representation bottleneck and its superiority to enhance the performance of GNNs over state-of-the-art graph rewiring baselines.

artificial intelligence, interaction, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2205.07266

Country:

Europe > Italy > Marche > Ancona Province > Ancona (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback