Simple is what you need for efficient and accurate medical image segmentation
Yu, Xiang, Chen, Yayan, He, Guannan, Zeng, Qing, Qin, Yue, Liang, Meiling, Luo, Dandan, Liao, Yimei, Ren, Zeyu, Kang, Cheng, Yang, Delong, Liang, Bocheng, Pu, Bin, Yuan, Ying, Li, Shengli
arXiv.org Artificial Intelligence
While modern segmentation models often prioritize performance over practicality, we advocate a design philosophy that prioritizes simplicity and efficiency, and apply it to the design of a high-performance segmentation model. This paper presents SimpleUNet, a scalable ultra-lightweight medical image segmentation model with three key innovations: (1) a partial feature selection mechanism in skip connections that reduces redundancy while enhancing segmentation performance; (2) a fixed-width architecture that prevents exponential parameter growth across network stages; (3) an adaptive feature fusion module that achieves enhanced representation with minimal computational overhead. With a record-breaking 16 KB parameter configuration, SimpleUNet outperforms LBUNet and other lightweight benchmarks across multiple public datasets. The 0.67 MB variant achieves superior efficiency (8.60 GFLOPs) and accuracy, attaining a mean DSC/IoU of 85.76%/75.60% on multi-center breast lesion datasets, surpassing both U-Net and TransUNet. Evaluations on skin lesion datasets (ISIC 2017/2018: mDice 84.86%/88.77%) and endoscopic polyp segmentation (Kvasir-SEG: 86.46%/76.48% mDice/mIoU) confirm consistent dominance over state-of-the-art models. This work demonstrates that extreme model compression need not compromise performance, providing new insights for efficient and accurate medical image segmentation. Code is available at https://github.com/Frankyu5666666/SimpleUNet.
In medical image segmentation, U-Net has been acknowledged as a successful and robust framework, distinguished by its U-shaped encoder-decoder architecture [1]-[4]. Generally, the skip connections between the encoder and the decoder concatenate the lower-level features from the encoder with the high-level features from the decoder for hierarchical feature fusion, mitigating issues such as vanishing or exploding gradients and thus leading to higher performance.
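For readers unfamiliar with the DSC (Dice) and IoU figures quoted above, the following is a minimal sketch of how the two overlap metrics are computed for a single pair of binary masks. It is not taken from the SimpleUNet repository: the function name `dice_and_iou`, the flat 0/1 list representation, and the handling of two empty masks are assumptions made for illustration only.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty; scoring this as perfect agreement is an
        # assumed convention, not something stated in the source.
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy example: two 6-pixel masks that overlap on 2 pixels.
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")
```

For a single mask pair the two scores are tied by IoU = Dice / (2 - Dice). Mean scores averaged over a dataset need not satisfy this identity exactly.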
The modular design has made U-Net a popular choice for semantic segmentation, especially in medical image segmentation scenarios where available datasets are limited. However, the model's progressively increasing width and its feature concatenation mechanism inherently introduce more parameters in the decoder path for information fusion, potentially resulting in information redundancy and reduced efficiency. Recent advances have sought to enhance segmentation performance by introducing novel computing operations and attention modules [5]-[8]. Although these innovations have improved accuracy, they often come at a high cost in parameter count and computational complexity, which challenges practical deployment in resource-constrained environments. In light of these limitations, researchers have endeavored to develop lightweight yet high-performance models for medical image segmentation, such as those utilizing depthwise convolution and state-space-based models [9]-[11].
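The parameter argument above can be made concrete with some simple arithmetic. The sketch below counts the weights and biases of plain 3x3 convolutions, one per stage, for a channel schedule that doubles at every stage versus one that stays fixed. It is a back-of-the-envelope illustration, not SimpleUNet's actual layer layout: the width schedules, the one-convolution-per-stage simplification, and the `skip_fraction` knob (a hypothetical stand-in for passing only part of the skip features to the decoder) are all assumptions.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a single k x k convolution."""
    return k * k * c_in * c_out + c_out


def encoder_params(widths, in_channels=3):
    """Sum the parameters of one 3x3 convolution per stage (a simplification)."""
    total = 0
    c_prev = in_channels
    for width in widths:
        total += conv_params(c_prev, width)
        c_prev = width
    return total


def fused_decoder_conv_params(width, skip_fraction=1.0):
    """3x3 convolution applied after concatenating decoder features with skip features.

    skip_fraction is a hypothetical knob: 1.0 concatenates every skip channel,
    smaller values concatenate only that share of them.
    """
    skip_channels = int(width * skip_fraction)
    return conv_params(width + skip_channels, width)


doubling_widths = [64, 128, 256, 512, 1024]  # classic doubling schedule (assumed)
fixed_widths = [16] * 5                      # constant width (assumed value)

print("doubling widths:", encoder_params(doubling_widths))
print("fixed widths:   ", encoder_params(fixed_widths))
print("fusion conv, full skip:", fused_decoder_conv_params(16, 1.0))
print("fusion conv, half skip:", fused_decoder_conv_params(16, 0.5))
```

Because the cost of a convolution scales with the product of its input and output channels, doubling the width at each stage roughly quadruples the per-stage parameter count, while a fixed width keeps it constant. Concatenation has a similar effect on the decoder, since it enlarges the input channel count of the convolution that follows it.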
Jun-17-2025
- Country:
- Asia > China
- Guangdong Province > Shenzhen (0.04)
- Hong Kong (0.04)
- Jilin Province > Changchun (0.04)
- Sichuan Province > Chengdu (0.04)
- Yunnan Province > Kunming (0.04)
- Europe
- Czechia > Prague (0.04)
- Germany > Bavaria
- Upper Bavaria > Munich (0.04)
- Spain > Andalusia
- Granada Province > Granada (0.04)
- North America > United States
- Georgia > Fulton County
- Atlanta (0.04)
- New Mexico > Bernalillo County
- Albuquerque (0.04)
- Genre:
- Research Report
- New Finding (0.46)
- Promising Solution (0.34)
- Industry:
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Technology: