Exploring Molecular Pretraining Model at Scale

May-29-2025, 13:37:29 GMT–Neural Information Processing Systems

In recent years, pretraining models have made significant advances in natural language processing (NLP), computer vision (CV), and the life sciences. The advances in NLP and CV are predominantly driven by the expansion of model parameters and data size, a phenomenon now known as scaling laws. However, scaling laws for molecular pretraining models remain unexplored. In this work, we present a molecular pretraining model that uses a two-track transformer to integrate features at the atomic level, graph level, and geometric structure level. We also systematically investigate scaling laws in molecular pretraining models, examining the power-law correlations between validation loss and model size, dataset size, and computational resources. Extensive experiments show consistent improvement on downstream tasks as model size grows.
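
As an illustration of what a power-law correlation between validation loss and model size means in practice, the sketch below fits a curve of the form loss ≈ a · N^(-alpha) by linear regression in log-log space. The functional form, the helper name `fit_power_law`, and all numbers are assumptions chosen for illustration; they are synthetic and are not the paper's method, data, or results.

```python
import numpy as np


def fit_power_law(x, loss):
    """Fit loss ~= a * x**(-alpha) via least squares on log(x), log(loss).

    Hypothetical helper for illustration; returns (a, alpha).
    """
    slope, intercept = np.polyfit(np.log(x), np.log(loss), 1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic values only: model sizes (parameter counts) and a loss curve
# generated from an assumed power law, not measurements from the paper.
model_sizes = np.array([1e7, 3e7, 1e8, 3e8, 1e9])
assumed_a, assumed_alpha = 5.0, 0.1
val_loss = assumed_a * model_sizes ** (-assumed_alpha)

a, alpha = fit_power_law(model_sizes, val_loss)
print(f"fitted a={a:.3f}, alpha={alpha:.3f}")
```

The same fit applies unchanged to dataset size or compute on the x-axis; on real measurements the points would scatter around the fitted line rather than lie exactly on it.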

artificial intelligence, exploring molecular pretraining model, natural language

Technology:
- Information Technology > Artificial Intelligence > Natural Language (1.00)