Speckle2Self: Self-Supervised Ultrasound Speckle Reduction Without Clean Data

Li, Xuesong; Navab, Nassir; Jiang, Zhongliang

arXiv.org Artificial Intelligence 

Image denoising is a fundamental task in computer vision, particularly in medical ultrasound (US) imaging, where speckle noise significantly degrades image quality. Although recent advances in deep neural networks have substantially improved denoising for natural images, these methods cannot be directly applied to US speckle noise, as it is not purely random. Instead, US speckle arises from complex wave interference within the body's microstructure, making it tissue-dependent. This dependency means that obtaining two independent noisy observations of the same scene, as required by the pioneering Noise2Noise, is not feasible. Blind-spot networks also cannot handle US speckle noise because of its high spatial dependency. To address this challenge, we introduce Speckle2Self, a novel self-supervised algorithm for speckle reduction using only single noisy observations. The key insight is that applying a multi-scale perturbation (MSP) operation introduces tissue-dependent variations in the speckle pattern across different scales, while preserving the shared anatomical structure. This enables effective speckle suppression by modeling the clean image as a low-rank signal and isolating the sparse noise component. To demonstrate its effectiveness, Speckle2Self is comprehensively compared with conventional filter-based denoising algorithms and SOTA learning-based methods, using both realistic simulated US images and human carotid US images. Additionally, data from multiple US machines are employed to evaluate model generalization and adaptability to images from unseen domains.

Introduction

Medical ultrasound (US) is one of the most important imaging modalities in modern clinical practice due to its affordability, non-invasiveness, and real-time capabilities Jiang et al. (2023a); Bi et al. (2023b).
US imaging visualises internal anatomical structures by emitting high-frequency acoustic waves (typically 2-15 MHz) into the body and detecting echoes scattered from tissue interfaces Szabo (2013). Compared to Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), US images generally suffer from lower image quality Kang et al. (2024); Stevens et al. (2024); Calis et al. (2025); Mwikirize et al. (2018), primarily due to speckle noise, one of the most prominent artefacts in B-mode imaging. This speckle noise arises from the coherent summation of echoes scattered by small-scale tissue structures (e.g., cells) and manifests as grainy patterns that degrade image clarity and contrast Krissian et al. (2005).

This work involved human subjects in its research. Approval of all ethical and experimental procedures and protocols was granted by the Institutional Review Board under No. 2022-87-S-KK, in accordance with the Declaration of Helsinki.
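
As an illustration of the coherent-summation origin of speckle described in the Introduction, the following minimal sketch uses a textbook random-phasor model. It is not the simulation used in this work. It assumes that each resolution cell contains many unit-amplitude scatterers whose echo phases are fixed by their (here randomly drawn) positions; the function names, the number of scatterers, and the number of cells are hypothetical choices made for illustration only. Under these assumptions the envelope is approximately Rayleigh-distributed, whose theoretical mean-to-standard-deviation ratio is sqrt(pi / (4 - pi)), roughly 1.91. This low ratio is what appears as grainy texture in B-mode images.

```python
import cmath
import math
import random
import statistics


def speckle_envelope(n_scatterers, rng):
    """Envelope of one resolution cell: magnitude of the coherent echo sum.

    Assumption: unit-amplitude scatterers with phases uniform in [0, 2*pi).
    The sum is divided by sqrt(N) so the mean-square envelope is 1.
    """
    total = 0j
    for _ in range(n_scatterers):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        total += cmath.exp(1j * phase)  # echoes add as complex phasors
    return abs(total) / math.sqrt(n_scatterers)


def simulate_cells(n_cells=10000, n_scatterers=50, seed=0):
    """Envelope values for n_cells independent resolution cells.

    The seed stands in for a fixed tissue microstructure: the same
    scatterer arrangement always yields the same speckle pattern.
    """
    rng = random.Random(seed)
    return [speckle_envelope(n_scatterers, rng) for _ in range(n_cells)]


envelopes = simulate_cells()
mean_env = statistics.fmean(envelopes)
std_env = statistics.pstdev(envelopes)
snr = mean_env / std_env  # empirical mean / std of the envelope
rayleigh_snr = math.sqrt(math.pi / (4.0 - math.pi))  # theoretical value

# Re-imaging the same "tissue" (same seed) reproduces the same pattern.
repeat = simulate_cells()
same_pattern = repeat == envelopes
```

Comparing `snr` with `rayleigh_snr` shows how closely the toy model follows the Rayleigh statistics. The `same_pattern` check mirrors, in this simplified setting, the point made in the abstract: because the pattern is determined by the scatterer arrangement, repeated acquisitions of the same scene do not provide independent noise realisations.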