Peeling Context from Cause for Multimodal Molecular Property Prediction

Li, Tao; Hou, Kaiyuan; Vinh, Tuan; Yang, Carl; Raj, Monika

Nov-11-2025, arXiv.org Artificial Intelligence

Deep models are used for molecular property prediction, yet they are often hard to interpret and may rely on spurious context rather than causal structure, which degrades reliability under distribution shift and harms predictive performance. We introduce CLaP (Causal Layerwise Peeling), a framework that separates causal signal from context layer by layer and integrates diverse graph representations of molecules. At each layer, a causal block performs a soft split into causal and trivial branches, fuses causal evidence across modalities, and progressively peels away batch-coupled context to concentrate on label-relevant structure, thereby limiting shortcut signals and stabilizing layerwise refinement. CLaP also yields atom-level causal saliency maps that highlight the substructures responsible for a prediction, providing actionable guidance for targeted molecular edits. Case studies confirm the accuracy of these maps and their alignment with chemical intuition. By peeling context from cause at every layer, the model delivers predictors that are both accurate and interpretable for molecular design.

Designing molecules with desired properties is a central goal in drug discovery and materials design (Sanchez-Lengeling & Aspuru-Guzik, 2018). Graph-based deep learning is effective for property prediction (Wu et al., 2018; Hinton et al., 2006; Bengio & LeCun, 2007; Goodfellow et al., 2016). However, models often exploit spurious correlations tied to datasets or batches (Geirhos et al., 2020), which hurts reliability under distribution shift.
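
The soft split into causal and trivial branches, the fusion of causal evidence across modalities, and the atom-level saliency described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the sigmoid gate, the mean-based fusion, the gate-averaged saliency, and every name below (soft_split, fuse_causal, atom_saliency, view_a, view_b, gate) are assumptions made only for illustration, and the peeling of batch-coupled context is not modeled.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_split(features, gate_weights):
    """Softly split per-atom features into causal and trivial parts.

    features: list of per-atom feature vectors (lists of floats).
    gate_weights: hypothetical gate parameters, one float per feature dim.
    Returns (causal, trivial, mask); causal + trivial reproduces features.
    """
    causal, trivial, mask = [], [], []
    for h in features:
        # One gate value in (0, 1) per atom; an assumed form of the soft split.
        m = sigmoid(sum(w * x for w, x in zip(gate_weights, h)))
        mask.append(m)
        causal.append([m * x for x in h])
        trivial.append([(1.0 - m) * x for x in h])
    return causal, trivial, mask


def fuse_causal(causal_by_modality):
    """Fuse causal branches across modalities by element-wise mean (assumed)."""
    n_mod = len(causal_by_modality)
    n_atoms = len(causal_by_modality[0])
    dim = len(causal_by_modality[0][0])
    return [
        [sum(c[a][d] for c in causal_by_modality) / n_mod for d in range(dim)]
        for a in range(n_atoms)
    ]


def atom_saliency(masks_by_modality):
    """Average the gate value of each atom across modalities (assumed)."""
    n_mod = len(masks_by_modality)
    return [sum(ms) / n_mod for ms in zip(*masks_by_modality)]


# Two hypothetical views of the same 3-atom molecule, 2 features per atom.
view_a = [[1.0, 2.0], [0.0, 4.0], [2.0, 2.0]]
view_b = [[0.5, 1.0], [1.0, 0.0], [2.0, 1.0]]
gate = [0.3, -0.2]  # made-up gate parameters

causal_a, trivial_a, mask_a = soft_split(view_a, gate)
causal_b, trivial_b, mask_b = soft_split(view_b, gate)
fused = fuse_causal([causal_a, causal_b])
saliency = atom_saliency([mask_a, mask_b])
```

In this sketch only the fused causal features would be passed to the next layer, while the per-atom saliency values give a rough ranking of which atoms the gate treats as label-relevant.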

artificial intelligence, machine learning, natural language

Genre:
- Research Report (0.50)

Industry:
- Health & Medicine > Pharmaceuticals & Biotechnology (0.89)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.66)
