Parida, Abhijeet
Data Alchemy: Mitigating Cross-Site Model Variability Through Test Time Data Calibration
Parida, Abhijeet, Alomar, Antonia, Jiang, Zhifan, Roshanitabrizi, Pooneh, Tapp, Austin, Ledesma-Carbayo, Maria, Xu, Ziyue, Anwar, Syed Muhammed, Linguraru, Marius George, Roth, Holger R.
Deploying deep learning-based imaging tools across various clinical sites poses significant challenges due to inherent domain shifts and regulatory hurdles associated with site-specific fine-tuning. For histopathology, stain normalization techniques can mitigate discrepancies, but they often fall short of eliminating inter-site variations. Therefore, we present Data Alchemy, an explainable stain normalization method combined with test time data calibration via a template learning framework to overcome barriers in cross-site analysis. Data Alchemy handles shifts inherent to multi-site data and minimizes them without needing to change the weights of the normalization or classifier networks. Our approach extends to unseen sites in various clinical settings where data domain discrepancies are unknown. Extensive experiments highlight the efficacy of our framework in tumor classification in hematoxylin and eosin-stained patches. Our explainable normalization method boosts classification tasks' area under the precision-recall curve (AUPR) by 0.165, 0.545 to 0.710. Additionally, Data Alchemy further reduces the multisite classification domain gap, by improving the 0.710 AUPR an additional 0.142, elevating classification performance further to 0.852, from 0.545. Our Data Alchemy framework can popularize precision medicine with minimal operational overhead by allowing for the seamless integration of pre-trained deep learning-based clinical tools across multiple sites.
Harvesting Private Medical Images in Federated Learning Systems with Crafted Models
Shi, Shanghao, Haque, Md Shahedul, Parida, Abhijeet, Linguraru, Marius George, Hou, Y. Thomas, Anwar, Syed Muhammad, Lou, Wenjing
Federated learning (FL) allows a set of clients to collaboratively train a machine-learning model without exposing local training samples. In this context, it is considered to be privacy-preserving and hence has been adopted by medical centers to train machine-learning models over private data. However, in this paper, we propose a novel attack named MediLeak that enables a malicious parameter server to recover high-fidelity patient images from the model updates uploaded by the clients. MediLeak requires the server to generate an adversarial model by adding a crafted module in front of the original model architecture. It is published to the clients in the regular FL training process and each client conducts local training on it to generate corresponding model updates. Then, based on the FL protocol, the model updates are sent back to the server and our proposed analytical method recovers private data from the parameter updates of the crafted module. We provide a comprehensive analysis for MediLeak and show that it can successfully break the state-of-the-art cryptographic secure aggregation protocols, designed to protect the FL systems from privacy inference attacks. We implement MediLeak on the MedMNIST and COVIDx CXR-4 datasets. The results show that MediLeak can nearly perfectly recover private images with high recovery rates and quantitative scores. We further perform downstream tasks such as disease classification with the recovered data, where our results show no significant performance degradation compared to using the original training samples.
D-Rax: Domain-specific Radiologic assistant leveraging multi-modal data and eXpert model predictions
Nisar, Hareem, Anwar, Syed Muhammad, Jiang, Zhifan, Parida, Abhijeet, Nath, Vishwesh, Roth, Holger R., Linguraru, Marius George
Large vision language models (VLMs) have progressed incredibly from research to applicability for general-purpose use cases. LLaVA-Med, a pioneering large language and vision assistant for biomedicine, can perform multi-modal biomedical image and data analysis to provide a natural language interface for radiologists. While it is highly generalizable and works with multi-modal data, it is currently limited by well-known challenges that exist in the large language model space. Hallucinations and imprecision in responses can lead to misdiagnosis which currently hinder the clinical adaptability of VLMs. To create precise, user-friendly models in healthcare, we propose D-Rax -- a domain-specific, conversational, radiologic assistance tool that can be used to gain insights about a particular radiologic image. In this study, we enhance the conversational analysis of chest X-ray (CXR) images to support radiological reporting, offering comprehensive insights from medical imaging and aiding in the formulation of accurate diagnosis. D-Rax is achieved by fine-tuning the LLaVA-Med architecture on our curated enhanced instruction-following data, comprising of images, instructions, as well as disease diagnosis and demographic predictions derived from MIMIC-CXR imaging data, CXR-related visual question answer (VQA) pairs, and predictive outcomes from multiple expert AI models. We observe statistically significant improvement in responses when evaluated for both open and close-ended conversations. Leveraging the power of state-of-the-art diagnostic models combined with VLMs, D-Rax empowers clinicians to interact with medical images using natural language, which could potentially streamline their decision-making process, enhance diagnostic accuracy, and conserve their time.
Lung-CADex: Fully automatic Zero-Shot Detection and Classification of Lung Nodules in Thoracic CT Images
Shaukat, Furqan, Anwar, Syed Muhammad, Parida, Abhijeet, Lam, Van Khanh, Linguraru, Marius George, Shah, Mubarak
Lung cancer has been one of the major threats to human life for decades. Computer-aided diagnosis can help with early lung nodul detection and facilitate subsequent nodule characterization. Large Visual Language models (VLMs) have been found effective for multiple downstream medical tasks that rely on both imaging and text data. However, lesion level detection and subsequent diagnosis using VLMs have not been explored yet. We propose CADe, for segmenting lung nodules in a zero-shot manner using a variant of the Segment Anything Model called MedSAM. CADe trains on a prompt suite on input computed tomography (CT) scans by using the CLIP text encoder through prefix tuning. We also propose, CADx, a method for the nodule characterization as benign/malignant by making a gallery of radiomic features and aligning image-feature pairs through contrastive learning. Training and validation of CADe and CADx have been done using one of the largest publicly available datasets, called LIDC. To check the generalization ability of the model, it is also evaluated on a challenging dataset, LUNGx. Our experimental results show that the proposed methods achieve a sensitivity of 0.86 compared to 0.76 that of other fully supervised methods.The source code, datasets and pre-processed data can be accessed using the link:
Harmonization Across Imaging Locations(HAIL): One-Shot Learning for Brain MRI
Parida, Abhijeet, Jiang, Zhifan, Anwar, Syed Muhammad, Foreman, Nicholas, Stence, Nicholas, Fisher, Michael J., Packer, Roger J., Avery, Robert A., Linguraru, Marius George
For machine learning-based prognosis and diagnosis of rare diseases, such as pediatric brain tumors, it is necessary to gather medical imaging data from multiple clinical sites that may use different devices and protocols. Deep learning-driven harmonization of radiologic images relies on generative adversarial networks (GANs). However, GANs notoriously generate pseudo structures that do not exist in the original training data, a phenomenon known as "hallucination". To prevent hallucination in medical imaging, such as magnetic resonance images (MRI) of the brain, we propose a one-shot learning method where we utilize neural style transfer for harmonization. At test time, the method uses one image from a clinical site to generate an image that matches the intensity scale of the collaborating sites. Our approach combines learning a feature extractor, neural style transfer, and adaptive instance normalization. We further propose a novel strategy to evaluate the effectiveness of image harmonization approaches with evaluation metrics that both measure image style harmonization and assess the preservation of anatomical structures. Experimental results demonstrate the effectiveness of our method in preserving patient anatomy while adjusting the image intensities to a new clinical site. Our general harmonization model can be used on unseen data from new sites, making it a valuable tool for real-world medical applications and clinical trials.