Can Molecular Foundation Models Know What They Don't Know? A Simple Remedy with Preference Optimization

He, Langzhou, Zhu, Junyou, Wang, Fangxin, Liu, Junhua, Xu, Haoyan, Zhao, Yue, Yu, Philip S., Wu, Qitian

Oct-1-2025–arXiv.org Artificial Intelligence

Molecular foundation models are rapidly advancing scientific discovery, but their unreliability on out-of-distribution (OOD) samples severely limits their application in high-stakes domains such as drug discovery and protein design. A critical failure mode is chemical hallucination, where models make high-confidence yet entirely incorrect predictions for unknown molecules. To address this challenge, we introduce Molecular Preference-Aligned Instance Ranking (Mole-PAIR), a simple, plug-and-play module that can be flexibly integrated with existing foundation models to improve their reliability on OOD data through cost-effective post-training. Specifically, our method formulates the OOD detection problem as a preference optimization over the estimated OOD affinity between in-distribution (ID) and OOD samples, achieving this goal through a pairwise learning objective. We show that this objective essentially optimizes AUROC, which measures how consistently ID and OOD samples are ranked by the model. Extensive experiments across five real-world molecular datasets demonstrate that our approach significantly improves the OOD detection capabilities of existing molecular foundation models, achieving up to 45.8%, 43.9%, and 24.3% improvements in AUROC under distribution shifts of size, scaffold, and assay, respectively.

data mining, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

Oct-1-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.46)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology:
- Information Technology
  - Data Science > Data Mining (0.93)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Natural Language (0.93)
    - Machine Learning
      - Performance Analysis > Accuracy (1.00)
      - Neural Networks (1.00)
      - Statistical Learning (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found