Generative deep learning for foundational video translation in ultrasound

Bhatnagar, Roshni; Tomic, Nikolina; Jain, Sarthak; Lau, Connor; Liu, Tien-Yu; Gambini, Laura; Arnaout, Rima

arXiv.org Artificial Intelligence 

Department of Medicine, Division of Cardiology; Bakar Computational Health Sciences Institute; UCSF-UC Berkeley Joint Program in Computational Precision Health; Department of Radiology, Center for Intelligent Imaging; University of California, San Francisco

Corresponding Author:

Keywords: medical imaging, video translation, deep learning, image synthesis, ultrasound

Word Count: 4129

Abstract

Deep learning (DL) has the potential to revolutionize image acquisition and interpretation across medicine; however, attention to data imbalance and missingness is required. Ultrasound data presents a particular challenge because, in addition to different views and structures, it includes several sub-modalities -- such as greyscale and color flow Doppler (CFD) -- that are often imbalanced in clinical studies. Image translation can help balance datasets but has to date remained challenging for ultrasound sub-modalities. Here, we present a generative method for ultrasound CFD-to-greyscale video translation, trained on 54,975 videos and tested on 8,368. The method leveraged pixel-wise, adversarial, and perceptual losses and utilized two networks: one for reconstructing anatomic structures and one for denoising to achieve realistic ultrasound imaging. Average pairwise SSIM between synthetic videos and ground truth was 0.91 ± 0.04. Synthetic videos performed indistinguishably from real ones in DL classification and segmentation tasks and when evaluated by blinded clinical experts: the F1 score was 0.90 for real and 0.89 for synthetic videos, and the Dice score between real and synthetic segmentations was 0.97. Overall clinician accuracy in distinguishing real vs. synthetic videos was 54 ± 6% (42-61%), indicating realistic synthetic videos. Although trained only on heart videos, the model worked well on ultrasound spanning several clinical domains (average SSIM 0.91 ± 0.05), demonstrating foundational abilities.
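The abstract reports two quantitative comparisons between synthetic and real videos: pairwise SSIM on frames and the Dice score between segmentations. As a minimal illustration of how such metrics are computed, the sketch below implements a simplified single-window (global) SSIM and the Dice overlap coefficient in NumPy. This is not the paper's evaluation code; the exact windowed SSIM implementation and mask-processing details used in the study are assumptions here.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Simplified single-window SSIM over a whole frame.

    The standard SSIM uses a sliding Gaussian window; this global
    variant applies the same formula once over the full image and is
    shown only to illustrate the metric's structure.
    """
    c1 = (0.01 * data_range) ** 2  # stabilizing constants from the SSIM paper
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

def dice(a, b):
    """Dice overlap between two binary segmentation masks (1.0 = identical)."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())
```

For identical inputs both metrics return 1.0; values decrease as the synthetic frame or segmentation diverges from the real one, which is how the reported 0.91 SSIM and 0.97 Dice figures should be read.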