Peeling Context from Cause for Multimodal Molecular Property Prediction
Li, Tao, Hou, Kaiyuan, Vinh, Tuan, Yang, Carl, Raj, Monika
arXiv.org Artificial Intelligence
Deep models are widely used for molecular property prediction, yet they are often hard to interpret and may rely on spurious context rather than causal structure, which degrades reliability under distribution shift and harms predictive performance. We introduce CLaP (Causal Layerwise Peeling), a framework that separates causal signal from context layer by layer and integrates diverse graph representations of molecules. At each layer, a causal block performs a soft split into causal and trivial branches, fuses causal evidence across modalities, and progressively peels away batch-coupled context to concentrate on label-relevant structure, thereby limiting shortcut signals and stabilizing layerwise refinement. The model also yields atom-level causal saliency maps that highlight the substructures responsible for a prediction, providing actionable guidance for targeted molecular edits. Case studies confirm the accuracy of these maps and their alignment with chemical intuition. By peeling context from cause at every layer, CLaP delivers predictors that are both accurate and interpretable for molecular design.

Designing molecules with desired properties is a central goal in drug discovery and materials design (Sanchez-Lengeling & Aspuru-Guzik, 2018). Graph-based deep learning is effective for property prediction (Wu et al., 2018; Hinton et al., 2006; Bengio & LeCun, 2007; Goodfellow et al., 2016). However, models often exploit spurious correlations tied to particular datasets or batches (Geirhos et al., 2020), which hurts reliability under distribution shift.
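To make the idea of a soft causal/trivial split concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a per-atom sigmoid gate with fixed, hypothetical weights (`gate_weights`), two toy "modalities" that share the same atoms, and simple averaging as the fusion step. The function names `soft_split` and `fuse_causal` are invented for illustration, and the peeling of batch-coupled context is not modeled here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_split(features, gate_weights):
    """Split per-atom features into causal and trivial parts.

    Assumption: a linear score passed through a sigmoid gives a gate g in
    (0, 1) per atom; causal = g * h and trivial = (1 - g) * h.
    """
    causal, trivial, gates = [], [], []
    for h in features:
        g = sigmoid(sum(w * x for w, x in zip(gate_weights, h)))
        gates.append(g)
        causal.append([g * x for x in h])
        trivial.append([(1.0 - g) * x for x in h])
    return causal, trivial, gates


def fuse_causal(causal_per_modality):
    """Fuse causal features across modalities by averaging (an assumption)."""
    n_mod = len(causal_per_modality)
    n_atoms = len(causal_per_modality[0])
    dim = len(causal_per_modality[0][0])
    return [
        [sum(m[a][d] for m in causal_per_modality) / n_mod for d in range(dim)]
        for a in range(n_atoms)
    ]


# Toy molecule: 3 atoms, 2 features per atom, seen through two representations.
view_a = [[1.0, 0.5], [0.2, -0.3], [-1.0, 0.8]]
view_b = [[0.9, 0.4], [0.1, -0.2], [-0.8, 1.0]]
gate_weights = [1.5, -0.5]  # hypothetical fixed weights; learned in practice

causal_a, trivial_a, gates_a = soft_split(view_a, gate_weights)
causal_b, trivial_b, gates_b = soft_split(view_b, gate_weights)

fused = fuse_causal([causal_a, causal_b])

# A crude atom-level saliency: the mean gate value across modalities.
saliency = [(ga + gb) / 2 for ga, gb in zip(gates_a, gates_b)]
```

In this sketch the causal and trivial parts always sum back to the input features, so the split redistributes information rather than discarding it. The gate values double as a rough per-atom saliency score, which is one simple way such maps could be read off.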
Nov-11-2025
- Country:
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Genre:
- Research Report (0.50)