Uni-Mol3: A Multi-Molecular Foundation Model for Advancing Organic Reaction Modeling
Wu, Lirong; Wang, Junjie; Gao, Zhifeng; Ji, Xiaohong; Zhu, Rong; Li, Xinyu; Zhang, Linfeng; Ke, Guolin; E, Weinan
arXiv.org Artificial Intelligence
Organic reactions, the foundation of the modern chemical industry, are crucial for new material development and drug discovery. However, deciphering reaction mechanisms and modeling multi-molecular relationships remain formidable challenges due to the complexity of molecular dynamics. While state-of-the-art models such as Uni-Mol2 have revolutionized single-molecular representation learning, their extension to multi-molecular systems, where chemical reactions inherently occur, remains underexplored. This paper introduces Uni-Mol3, a deep learning framework that employs a hierarchical pipeline for multi-molecular reaction modeling. At its core, Uni-Mol3 adopts a multi-scale molecular tokenizer (Mol-Tokenizer) that encodes the 3D structures of molecules and other features into discrete tokens, creating a 3D-aware molecular language. The framework combines two pre-training stages: molecular pre-training to learn molecular grammar, and reaction pre-training to capture fundamental reaction principles, forming a progressive learning paradigm from single- to multi-molecular systems. With prompt-aware downstream fine-tuning, Uni-Mol3 performs strongly on diverse organic reaction tasks and supports multi-task prediction with strong generalizability. Experimental results across 10 datasets spanning 4 downstream tasks show that Uni-Mol3 outperforms existing methods, validating its effectiveness in modeling complex organic reactions. This work offers an alternative paradigm for multi-molecular computational modeling and charts a course for intelligent organic reaction modeling by bridging molecular representation with reaction mechanism understanding.
Aug-13-2025
- Genre:
- Research Report > Promising Solution (0.34)
- Industry:
- Health & Medicine > Pharmaceuticals & Biotechnology (0.88)
- Materials > Chemicals (1.00)