CL-MFAP: A Contrastive Learning-Based Multimodal Foundation Model for Molecular Property Prediction and Antibiotic Screening
Zhou, Gen; Janarthanan, Sugitha; Lu, Yutong; Hu, Pingzhao
arXiv.org Artificial Intelligence
Due to the rise in antimicrobial resistance, identifying novel compounds with antibiotic potential is crucial for combating this global health issue. However, traditional drug development methods are costly and inefficient. Recognizing the need for more effective solutions, researchers have turned to machine learning techniques to streamline the prediction and development of novel antibiotic compounds. While foundation models have shown promise in antibiotic discovery, current mainstream efforts do not fully leverage multimodal molecular data. Recent studies suggest that contrastive learning frameworks utilizing multimodal data perform well in representation learning across various domains. Building on this, we introduce CL-MFAP, an unsupervised contrastive learning (CL)-based multimodal foundation (MF) model tailored for discovering small molecules with potential antibiotic properties (AP) using three types of molecular data. The model uses 1.6 million bioactive molecules with drug-like properties from the ChEMBL dataset to jointly pretrain three encoders under a contrastive learning objective: (1) a transformer-based encoder with rotary position embedding for processing SMILES strings; (2) another transformer-based encoder, incorporating a novel bi-level routing attention mechanism, for molecular graph representations; and (3) a multilayer perceptron encoder for Morgan fingerprints. CL-MFAP outperforms baseline models in antibiotic property prediction by effectively utilizing different molecular modalities and demonstrates superior domain-specific performance when fine-tuned for antibiotic-related property prediction tasks.
Bacteria play a pivotal role in a diverse array of diseases within the human body, serving as either the primary cause or a contributing factor. A promising and sometimes sole treatment for these diseases is antibiotics, a specialized class of drugs designed to target pathogenic bacteria. Despite advancements, many pathogenic bacteria still lack effective antibiotics, and antibiotic resistance allows bacteria to survive once-effective treatments. Consequently, there is a pressing demand for the continual development of antibiotics.
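The abstract describes jointly pretraining SMILES, graph, and fingerprint encoders with contrastive learning but does not state the loss. As a rough illustration only, the sketch below assumes a symmetric InfoNCE-style objective averaged over the three modality pairs; this choice, the function names (`info_nce`, `multimodal_contrastive_loss`), the temperature value, and the toy embeddings are all hypothetical and not taken from the paper. The encoders themselves are omitted: each input is a batch of already-computed embedding vectors, where row i in every modality is assumed to belong to the same molecule.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss where anchors[i] should match positives[i].

    All other rows of `positives` in the batch act as negatives.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def multimodal_contrastive_loss(smiles_emb, graph_emb, fp_emb, temperature=0.1):
    """Average symmetric InfoNCE over the three modality pairs (an assumption)."""
    pairs = [(smiles_emb, graph_emb), (smiles_emb, fp_emb), (graph_emb, fp_emb)]
    loss = 0.0
    for x, y in pairs:
        loss += 0.5 * (info_nce(x, y, temperature) + info_nce(y, x, temperature))
    return loss / len(pairs)


# Toy batch of four "molecules" with 4-dimensional one-hot embeddings.
toy = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
rotated = toy[1:] + toy[:1]  # graph rows no longer line up with the other modalities

aligned_loss = multimodal_contrastive_loss(toy, toy, toy)
misaligned_loss = multimodal_contrastive_loss(toy, rotated, toy)
print(aligned_loss < misaligned_loss)
```

In this toy setup the loss is small when the three views of each molecule agree and grows when one modality is shuffled, which is the signal a contrastive pretraining stage would use to pull matching views together.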
Feb-16-2025
- Country:
  - North America
    - United States (0.68)
    - Canada > Ontario (0.28)
- Genre:
  - Research Report > New Finding (1.00)
- Technology: