slice thickness
Longitudinal Vestibular Schwannoma Dataset with Consensus-based Human-in-the-loop Annotations
Wijethilake, Navodini, Ivory, Marina, MacCormac, Oscar, Kumar, Siddhant, Kujawa, Aaron, Macias, Lorena Garcia-Foncillas, Burger, Rebecca, Hitchings, Amanda, Thomson, Suki, Barazi, Sinan, Maratos, Eleni, Obholzer, Rupert, Jiang, Dan, McClenaghan, Fiona, Chia, Kazumi, Al-Salihi, Omar, Thomas, Nick, Connor, Steve, Vercauteren, Tom, Shapey, Jonathan
Accurate segmentation of vestibular schwannoma (VS) on Magnetic Resonance Imaging (MRI) is essential for patient management but often requires time-intensive manual annotations by experts. While recent advances in deep learning (DL) have facilitated automated segmentation, challenges remain in achieving robust performance across diverse datasets and complex clinical cases. We present an annotated dataset stemming from a bootstrapped DL-based framework for iterative segmentation and quality refinement of VS in MRI. We combine data from multiple centres and rely on expert consensus for trustworthiness of the annotations. We show that our approach enables effective and resource-efficient generalisation of automated segmentation models to a target data distribution. The framework achieved a significant improvement in segmentation accuracy with a Dice Similarity Coefficient (DSC) increase from 0.9125 to 0.9670 on our target internal validation dataset, while maintaining stable performance on representative external datasets. Expert evaluation on 143 scans further highlighted areas for model refinement, revealing nuanced cases where segmentation required expert intervention. The proposed approach is estimated to enhance efficiency by approximately 37.4% compared to the conventional manual annotation process. Overall, our human-in-the-loop model training approach achieved high segmentation accuracy, highlighting its potential as a clinically adaptable and generalisable strategy for automated VS segmentation in diverse clinical settings. The dataset includes 190 patients, with tumour annotations available for 534 longitudinal contrast-enhanced T1-weighted (T1CE) scans from 184 patients, and non-annotated T2-weighted scans from 6 patients. This dataset is publicly accessible on The Cancer Imaging Archive (TCIA) (https://doi.org/10.7937/bq0z-xa62).
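The Dice Similarity Coefficient reported above measures the overlap between a predicted mask and a reference mask. The following is a minimal sketch of the standard definition, assuming binary masks flattened into plain Python sequences. The function name `dice_coefficient` and the toy masks are illustrative and not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two flattened binary masks.

    DSC = 2 * |pred AND truth| / (|pred| + |truth|)
    """
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy example (hypothetical voxels, not dataset values).
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 0, 1]
print(round(dice_coefficient(predicted, reference), 4))
```

A value of 1.0 means the two masks are identical, and 0.0 means they share no foreground voxels.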
MedSR-Impact: Transformer-Based Super-Resolution for Lung CT Segmentation, Radiomics, Classification, and Prognosis
Martell, Marc Boubnovski, Linton-Reid, Kristofer, Chen, Mitchell, Hindocha, Sumeet, Hunter, Benjamin, Calzado, Marco A., Lee, Richard, Posma, Joram M., Aboagye, Eric O.
High-resolution volumetric computed tomography (CT) is essential for accurate diagnosis and treatment planning in thoracic diseases; however, it is limited by radiation dose and hardware costs. We present the Transformer Volumetric Super-Resolution Network (TVSRN-V2), a transformer-based super-resolution (SR) framework designed for practical deployment in clinical lung CT analysis. Built from scalable components, including Through-Plane Attention Blocks (TAB) and Swin Transformer V2, our model effectively reconstructs fine anatomical details in low-dose CT volumes and integrates seamlessly with downstream analysis pipelines. We evaluate its effectiveness on three critical lung cancer tasks -- lobe segmentation, radiomics, and prognosis -- across multiple clinical cohorts. To enhance robustness across variable acquisition protocols, we introduce pseudo-low-resolution augmentation, simulating scanner diversity without requiring private data. TVSRN-V2 demonstrates a significant improvement in segmentation accuracy (+4% Dice), higher radiomic feature reproducibility, and enhanced predictive performance (+0.06 C-index and AUC). These results indicate that SR-driven recovery of structural detail significantly enhances clinical decision support, positioning TVSRN-V2 as a well-engineered, clinically viable system for dose-efficient imaging and quantitative analysis in real-world CT workflows.
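The abstract does not specify how pseudo-low-resolution volumes are generated. One common and simple way to mimic thicker slices is to average groups of adjacent thin slices along the through-plane axis. The sketch below illustrates only that generic idea. It is not the TVSRN-V2 procedure, and the function name `simulate_thick_slices` and the nested-list volume layout are assumptions made for illustration.

```python
def simulate_thick_slices(volume, factor):
    """Average every `factor` consecutive slices of a thin-slice volume.

    `volume` is a list of slices; each slice is a list of rows of floats.
    Trailing slices that do not fill a complete group are dropped.
    This is a naive stand-in for thick-slice acquisition, not a scanner model.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    thick = []
    for start in range(0, len(volume) - factor + 1, factor):
        group = volume[start:start + factor]
        n_rows = len(group[0])
        averaged = [
            [sum(s[r][c] for s in group) / factor for c in range(len(group[0][r]))]
            for r in range(n_rows)
        ]
        thick.append(averaged)
    return thick


# Four hypothetical 1x2 thin slices merged into two thick slices.
thin = [[[0.0, 2.0]], [[2.0, 4.0]], [[4.0, 6.0]], [[6.0, 8.0]]]
print(simulate_thick_slices(thin, 2))  # [[[1.0, 3.0]], [[5.0, 7.0]]]
```

Varying `factor` across training samples is one way such an augmentation could expose a model to several effective slice thicknesses.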
3D Foundation AI Model for Generalizable Disease Detection in Head Computed Tomography
Zhu, Weicheng, Huang, Haoxu, Tang, Huanze, Musthyala, Rushabh, Yu, Boyang, Chen, Long, Vega, Emilio, O'Donnell, Thomas, Dehkharghani, Seena, Frontera, Jennifer A., Masurkar, Arjun V., Melmed, Kara, Razavian, Narges
Head computed tomography (CT) is a widely used imaging modality with a multitude of medical indications, particularly in assessing pathology of the brain, skull, and cerebrovascular system. It is commonly the first-line imaging in neurologic emergencies given its rapidity of image acquisition, safety, cost, and ubiquity. Deep learning models may facilitate detection of a wide range of diseases. However, the scarcity of high-quality labels and annotations, particularly among less common conditions, significantly hinders the development of powerful models. To address this challenge, we introduce FM-CT: a Foundation Model for Head CT for generalizable disease detection, trained using self-supervised learning. Our approach pre-trains a deep learning model on a large, diverse dataset of 361,663 non-contrast 3D head CT scans without the need for manual annotations, enabling the model to learn robust, generalizable features. To investigate the potential of self-supervised learning in head CT, we employ both discrimination with self-distillation and masked image modeling, and we construct our model in 3D rather than at the slice level (2D) to exploit the structure of head CT scans more comprehensively and efficiently. The model's downstream classification performance is evaluated using internal and three external datasets, encompassing both in-distribution (ID) and out-of-distribution (OOD) data. Our results demonstrate that the self-supervised foundation model significantly improves performance on downstream diagnostic tasks compared to models trained from scratch and previous 3D CT foundation models on scarce annotated datasets. This work highlights the effectiveness of self-supervised learning in medical imaging and sets a new benchmark for head CT image analysis in 3D, enabling broader use of artificial intelligence for head CT-based diagnosis.
Exploring the Feasibility of AI-Assisted Spine MRI Protocol Optimization Using DICOM Image Metadata
Vian, Alice, Eifer, Diego Andre, Anes, Mauricio, Garcia, Guilherme Ribeiro, Recamonde-Mendoza, Mariana
Artificial intelligence (AI) is increasingly being utilized to optimize magnetic resonance imaging (MRI) protocols. Given that image details are critical for diagnostic accuracy, optimizing MRI acquisition protocols is essential for enhancing image quality. While medical physicists are responsible for this optimization, the variability in equipment usage and the wide range of MRI protocols in clinical settings pose significant challenges. This study aims to validate the application of AI in optimizing MRI protocols using dynamic data from clinical practice, specifically DICOM metadata. To achieve this, four MRI spine exam databases were created, with the target attribute being the binary classification of image quality (good or bad). Five AI models were trained to identify trends in acquisition parameters that influence image quality, grounded in MRI theory. These trends were analyzed using SHAP graphs. The models achieved F1 performance ranging from 77% to 93% for datasets containing 292 or more instances, with the observed trends aligning with MRI theory. The models effectively reflected the practical realities of clinical MRI settings, offering a valuable tool for medical physicists in quality control tasks. In conclusion, AI has demonstrated its potential to optimize MRI protocols, supporting medical physicists in improving image quality and enhancing the efficiency of quality control in clinical practice.
Automatic Tongue Delineation from MRI Images with a Convolutional Neural Network Approach
Isaieva, Karyna, Laprie, Yves, Turpault, Nicolas, Houssard, Alexis, Felblinger, Jacques, Vuissoz, Pierre-André
Tongue contour extraction from real-time magnetic resonance images is a nontrivial task due to the presence of artifacts manifesting in the form of blurring or ghostly contours. In this work, we present results of automatic tongue delineation achieved by means of a U-Net auto-encoder convolutional neural network. We present both intra- and inter-subject validation. We used real-time magnetic resonance images and manually annotated 1-pixel-wide contours as inputs. Predicted probability maps were post-processed in order to obtain 1-pixel-wide tongue contours. The results are very good and slightly outperform published results on automatic tongue segmentation.
Towards Non-invasive and Personalized Management of Breast Cancer Patients from Multiparametric MRI via A Large Mixture-of-Modality-Experts Model
Luo, Luyang, Wu, Mingxiang, Li, Mei, Xin, Yi, Wang, Qiong, Vardhanabhuti, Varut, Chu, Winnie CW, Li, Zhenhui, Zhou, Juan, Rajpurkar, Pranav, Chen, Hao
Breast magnetic resonance imaging (MRI) is the imaging technique with the highest sensitivity for detecting breast cancer and is routinely used for women at high risk. Despite the comprehensive multiparametric protocol of breast MRI, existing artificial intelligence-based studies predominantly rely on single sequences and have limited validation. Here we report a large mixture-of-modality-experts model (MOME) that integrates multiparametric MRI information within a unified structure, offering a noninvasive method for personalized breast cancer management. We have curated the largest multiparametric breast MRI dataset, involving 5,205 patients from three hospitals in the north, southeast, and southwest of China, for the development and extensive evaluation of our model. MOME demonstrated accurate and robust identification of breast cancer. It achieved comparable performance for malignancy recognition to that of four senior radiologists and significantly outperformed a junior radiologist, with 0.913 AUROC, 0.948 AUPRC, 0.905 F1 score, and 0.723 MCC. Our findings suggest that MOME could reduce the need for biopsies in BI-RADS 4 patients with a ratio of 7.3%, classify triple-negative breast cancer with an AUROC of 0.709, and predict pathological complete response to neoadjuvant chemotherapy with an AUROC of 0.694. The model further supports scalable and interpretable inference, adapting to missing modalities and providing decision explanations by highlighting lesions and measuring modality contributions. MOME exemplifies a discriminative, robust, scalable, and interpretable multimodal model, paving the way for noninvasive, personalized management of breast cancer patients based on multiparametric breast imaging data.
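Two of the threshold-based metrics quoted above, the F1 score and the Matthews correlation coefficient (MCC), follow directly from the binary confusion matrix. The sketch below shows the standard formulas on a toy set of labels. The helper names and the example labels are illustrative and do not come from the MOME study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels encoded as 0/1."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t and p:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels: 1 = malignant, 0 = benign.
truth = [1, 1, 1, 0, 0, 0, 1, 0]
preds = [1, 1, 0, 0, 0, 1, 1, 0]
tp, fp, tn, fn = confusion_counts(truth, preds)
print(f1_score(tp, fp, fn))  # 0.75
print(mcc(tp, fp, tn, fn))   # 0.5
```

Unlike F1, MCC also accounts for true negatives, which is why the two values can differ noticeably on the same predictions.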
Synthetic Data for Robust Stroke Segmentation
Chalcroft, Liam, Pappas, Ioannis, Price, Cathy J., Ashburner, John
Deep learning-based semantic segmentation in neuroimaging currently requires high-resolution scans and extensive annotated datasets, posing significant barriers to clinical applicability. We present a novel synthetic framework for the task of lesion segmentation, extending the capabilities of the established SynthSeg approach to accommodate large heterogeneous pathologies with lesion-specific augmentation strategies. Our method trains deep learning models, demonstrated here with the UNet architecture, using label maps derived from healthy and stroke datasets, facilitating the segmentation of both healthy tissue and pathological lesions without sequence-specific training data. Evaluated against in-domain and out-of-domain (OOD) datasets, our framework demonstrates robust performance, rivaling current methods within the training domain and significantly outperforming them on OOD data. This contribution holds promise for advancing medical imaging analysis in clinical settings, especially for stroke pathology, by enabling reliable segmentation across varied imaging sequences with reduced dependency on large annotated corpora.
Enhancing Super-Resolution Networks through Realistic Thick-Slice CT Simulation
Tang, Zeyu, Xing, Xiaodan, Yang, Guang
This study aims to develop and evaluate an innovative simulation algorithm for generating thick-slice CT images that closely resemble actual images in the AAPM-Mayo's 2016 Low Dose CT Grand Challenge dataset. The proposed method was evaluated using Peak Signal-to-Noise Ratio (PSNR) and Root Mean Square Error (RMSE) metrics, with the hypothesis that our simulation would produce images more congruent with their real counterparts. Our proposed method demonstrated substantial enhancements in terms of both PSNR and RMSE over other simulation methods. The highest PSNR values were obtained with the proposed method, yielding 49.7369 ± 2.5223 and 48.5801 ± 7.3271 for D45 and B30 reconstruction kernels, respectively. The proposed method also registered the lowest RMSE with values of 0.0068 ± 0.0020 and 0.0108 ± 0.0099 for D45 and B30, respectively, indicating a distribution more closely aligned with the authentic thick-slice image. Further validation of the proposed simulation algorithm was conducted using the TCIA LDCT-and-Projection-data dataset. The generated images were then leveraged to train four distinct super-resolution (SR) models, which were subsequently evaluated using the real thick-slice images from the 2016 Low Dose CT Grand Challenge dataset. When trained with data produced by our novel algorithm, all four SR models exhibited enhanced performance.
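PSNR and RMSE, the two metrics used above, are both derived from the mean squared difference between a reference image and a test image. The following is a minimal sketch of the textbook definitions on flattened intensity lists. The `data_range` default of 1.0 assumes intensities normalised to [0, 1], which is an assumption here, as the abstract does not state the normalisation used.

```python
import math


def rmse(reference, test):
    """Root mean square error between two equally sized intensity lists."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return math.sqrt(mse)


def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 20 * log10(data_range / RMSE)."""
    err = rmse(reference, test)
    if err == 0:
        return math.inf  # identical images
    return 20.0 * math.log10(data_range / err)


# Hypothetical pixel values, not taken from the dataset.
real_thick = [0.0, 0.5, 1.0, 0.5]
simulated = [0.1, 0.5, 0.9, 0.5]
print(round(rmse(real_thick, simulated), 4))
print(round(psnr(real_thick, simulated), 2))
```

Lower RMSE and higher PSNR both indicate that the simulated image is closer to the real one. Note that a mean PSNR over many images is not the same as the PSNR computed from the mean RMSE.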
SuperMask: Generating High-resolution object masks from multi-view, unaligned low-resolution MRIs
Gu, Hanxue, He, Hongyu, Colglazier, Roy, Axelrod, Jordan, French, Robert, Mazurowski, Maciej A
Three-dimensional segmentation in magnetic resonance images (MRI), which reflects the true shape of the objects, is challenging since high-resolution isotropic MRIs are rare and typical MRIs are anisotropic, with the out-of-plane dimension having a much lower resolution. A potential remedy to this issue lies in the fact that often multiple sequences are acquired on different planes. However, in practice, these sequences are not orthogonal to each other, limiting the applicability of many previous solutions to reconstruct higher-resolution images from multiple lower-resolution ones. We propose a weakly-supervised deep learning-based solution to generating high-resolution masks from multiple low-resolution images. Our method combines segmentation and unsupervised registration networks by introducing two new regularizations to make registration and segmentation reinforce each other. Finally, we introduce a multi-view fusion method to generate high-resolution target object masks. The experimental results on two datasets show the superiority of our methods. Importantly, the advantage of not using high-resolution images in the training process makes our method applicable to a wide variety of MRI segmentation tasks.
Deep Learning Body Region Classification of MRI and CT examinations
Raffy, Philippe, Pambrun, Jean-François, Kumar, Ashish, Dubois, David, Patti, Jay Waldron, Cairns, Robyn Alexandra, Young, Ryan
Standardized body region labelling of individual images provides data that can improve human and computer use of medical images. A CNN-based classifier was developed to identify body regions in CT and MRI. For the classification task, 17 CT (18 MRI) body regions covering the entire human body were defined. Three retrospective databases were built for the AI model training, validation, and testing, with a balanced distribution of studies per body region. The test databases originated from a different healthcare network. Accuracy, recall and precision of the classifier were evaluated for patient age, patient gender, institution, scanner manufacturer, contrast, slice thickness, MRI sequence, and CT kernel. The data included a retrospective cohort of 2,934 anonymized CT cases (training: 1,804 studies, validation: 602 studies, test: 528 studies) and 3,185 anonymized MRI cases (training: 1,911 studies, validation: 636 studies, test: 638 studies). 27 institutions from primary care hospitals, community hospitals and imaging centers contributed to the test datasets. The data included cases of all genders in equal proportions and subjects aged from a few months to over 90 years. An image-level prediction accuracy of 91.9% (90.2 - 92.1) for CT, and 94.2% (92.0 - 95.6) for MRI was achieved. The classification results were robust across all body regions and confounding factors. Due to limited data, performance results for subjects under 10 years old could not be reliably evaluated. We show that deep learning models can classify CT and MRI images by body region, including lower and upper extremities, with high accuracy.
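Accuracy, recall and precision for a multi-class task such as body region classification can be computed per class from paired true and predicted labels. The sketch below shows one straightforward way to do this. The function name `per_class_metrics` and the three region labels are hypothetical and do not reflect the 17 CT or 18 MRI classes defined in the study.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and per-class precision/recall."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        metrics[label] = {
            "precision": tp[label] / predicted if predicted else 0.0,
            "recall": tp[label] / actual if actual else 0.0,
        }
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, metrics


# Hypothetical image-level labels.
truth = ["head", "head", "chest", "chest", "abdomen", "abdomen"]
preds = ["head", "chest", "chest", "chest", "abdomen", "head"]
accuracy, metrics = per_class_metrics(truth, preds)
print(round(accuracy, 3))
for region, values in metrics.items():
    print(region, values)
```

To study a confounding factor such as slice thickness, the same function could be applied separately to each subgroup of images.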