MMLNB: Multi-Modal Learning for Neuroblastoma Subtyping Classification Assisted with Textual Description Generation
Chen, Huangwei, Chen, Yifei, Yan, Zhenyu, Ding, Mingyang, Li, Chenlei, Zhu, Zhu, Qin, Feiwei
arXiv.org Artificial Intelligence
Neuroblastoma (NB), a leading cause of childhood cancer mortality, exhibits significant histopathological variability, necessitating precise subtyping for accurate prognosis and treatment. Traditional diagnostic methods rely on subjective evaluations that are time-consuming and inconsistent. To address these challenges, we introduce MMLNB, a multi-modal learning (MML) model that integrates pathological images with generated textual descriptions to improve classification accuracy and interpretability. The approach follows a two-stage process. First, we fine-tune a Vision-Language Model (VLM) to enhance pathology-aware text generation. Second, the fine-tuned VLM generates textual descriptions, and a dual-branch architecture independently extracts visual and textual features. These features are fused via a Progressive Robust Multi-Modal Fusion (PRMF) block for stable training. Experimental results show that MMLNB is more accurate than single-modal models. Ablation studies demonstrate the importance of multi-modal fusion, fine-tuning, and the PRMF mechanism. This research provides a scalable AI-driven framework for digital pathology, enhancing reliability and interpretability in NB subtyping classification. Our source code is available at https://github.com/HovChen/MMLNB.
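The abstract does not spell out how the PRMF block works internally, so the following is only a generic illustration of dual-branch feature fusion, not the authors' method. It assumes that each branch has already produced a fixed-size embedding per sample, and it uses a hypothetical linear warm-up weight (`text_weight`) to phase in the textual branch gradually. All function names, dimensions, and the warm-up schedule are assumptions made for this sketch.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row to unit L2 norm so neither branch dominates by magnitude."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def text_weight(step, warmup_steps):
    """Hypothetical linear ramp from 0 to 1 for the text branch's contribution."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, max(0.0, step / warmup_steps))

def fuse(visual, textual, step, warmup_steps=100):
    """Concatenate normalized visual features with weighted, normalized text features."""
    w = text_weight(step, warmup_steps)
    return np.concatenate([l2_normalize(visual), w * l2_normalize(textual)], axis=-1)

# Random stand-ins for the two branches' outputs (assumed shapes, not from the paper).
rng = np.random.default_rng(0)
visual = rng.normal(size=(4, 8))   # image-branch embeddings: 4 samples, 8 dims
textual = rng.normal(size=(4, 6))  # text-branch embeddings: 4 samples, 6 dims

fused_start = fuse(visual, textual, step=0)    # text branch fully suppressed
fused_end = fuse(visual, textual, step=100)    # text branch at full weight
```

In a real pipeline the fused vector would feed a classification head. The actual PRMF design should be taken from the paper or the linked repository rather than from this sketch.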
Mar-19-2025
- Country:
- Asia > China > Zhejiang Province (0.29)
- Genre:
- Research Report
- Experimental Study (0.68)
- New Finding (0.88)
- Industry:
- Health & Medicine > Therapeutic Area > Oncology > Childhood Cancer (1.00)
- Technology: