A Fast and Efficient Modern BERT based Text-Conditioned Diffusion Model for Medical Image Segmentation

Dhara, Venkata Siddharth, Kumar, Pawan

arXiv.org Artificial Intelligence 

Denoising diffusion probabilistic models (DPMs) have recently shown significant success in medical image generation and denoising, while also serving as powerful representation learners for downstream tasks such as segmentation. However, their effectiveness in segmentation is limited by the need for detailed pixel-wise annotations, which are expensive and time-consuming to produce and require expert knowledge, a significant bottleneck in real-world clinical applications. To mitigate this label-efficiency limitation, we propose FastTextDiff, a fast and efficient diffusion-based segmentation model that integrates medical text annotations to enhance semantic representations. Our approach leverages ModernBERT [3], a transformer-based language model capable of processing long medical text sequences, to establish a strong connection between textual annotations and the semantic content of medical images. Because it has been trained on both MIMIC-III [19] and MIMIC-IV, ModernBERT can efficiently encode clinical knowledge to guide segmentation. Cross-modal attention mechanisms enable smooth interaction between the visual and textual modalities, making label-efficient segmentation with improved performance possible. This study validates ModernBERT as a fast and scalable alternative to Clinical BioBERT [4] in diffusion-based segmentation pipelines [2] and demonstrates the promise of multi-modal techniques for medical image analysis.
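
To make the cross-modal attention idea concrete, the sketch below shows one common form of it: each visual feature acts as a query and attends over text token embeddings, which serve as keys and values. This is not FastTextDiff's implementation, which the abstract does not specify. The function names (`cross_attention`, `softmax`, `dot`), the absence of learned projection matrices, and the residual addition are all simplifying assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(image_feats: Matrix, text_feats: Matrix) -> Matrix:
    """Fuse text information into image features with scaled dot-product attention.

    Assumption (hypothetical simplification): queries, keys and values are the
    raw features themselves, all of the same dimension, with no learned
    projections. A real model would apply trained linear maps first.
    """
    dim = len(text_feats[0])
    scale = 1.0 / math.sqrt(dim)
    fused: Matrix = []
    for query in image_feats:
        # Attention weights of this image feature over every text token.
        weights = softmax([dot(query, key) * scale for key in text_feats])
        # Context vector: attention-weighted average of the text features.
        context = [
            sum(w * value[j] for w, value in zip(weights, text_feats))
            for j in range(dim)
        ]
        # Residual addition keeps the original visual signal.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


if __name__ == "__main__":
    # Two toy image features and three toy text token embeddings (made-up values).
    image_feats = [[1.0, 0.0], [0.0, 1.0]]
    text_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in cross_attention(image_feats, text_feats):
        print(row)
```

In a text-conditioned diffusion segmenter, `image_feats` would typically come from an intermediate layer of the denoising network and `text_feats` from the language encoder; both sources are assumptions here, since the abstract does not say where the attention is applied.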
