T2ID-CAS: Diffusion Model and Class Aware Sampling to Mitigate Class Imbalance in Neck Ultrasound Anatomical Landmark Detection

Varaganti, Manikanta; Vankayalapati, Amulya; Awad, Nour; Dion, Gregory R.; Brattain, Laura J.

arXiv.org Artificial Intelligence 

Manikanta Varaganti 1, Amulya Vankayalapati 2, Nour Awad 2, Gregory R. Dion 2, and Laura J. Brattain 1,3

1 Department of Computer Science, University of Central Florida, Orlando, FL, USA
2 Department of Otolaryngology Head Neck Surgery, University of Cincinnati College of Medicine, OH, USA
3 Department of Internal Medicine, University of Central Florida College of Medicine, Orlando, FL, USA

Abstract -- Neck ultrasound (US) plays a vital role in airway management by providing non-invasive, real-time imaging that enables rapid and precise interventions. Deep learning-based anatomical landmark detection in neck US can further facilitate procedural efficiency. However, class imbalance within datasets, where key structures like tracheal rings and vocal folds are underrepresented, presents significant challenges for object detection models. To address this, we propose T2ID-CAS, a hybrid approach that combines a text-to-image latent diffusion model with class-aware sampling to generate high-quality synthetic samples for underrepresented classes. This approach, rarely explored in the ultrasound domain, improves the representation of minority classes. Experimental results using YOLOv9 for anatomical landmark detection in neck US demonstrated that T2ID-CAS achieved a mean Average Precision of 88.2, significantly surpassing the baseline of 66.
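The abstract names class-aware sampling but does not say how it is carried out. As a rough illustration only, the sketch below shows one common form of the idea: pick a class uniformly at random, then pick an image that contains that class, so rare classes are drawn about as often as common ones. This is not the authors' implementation. The function names, the toy image ids, and the `common_structure` label are hypothetical; only the tracheal ring and vocal fold classes are taken from the abstract.

```python
import random
from collections import Counter, defaultdict


def build_class_index(labels_per_image):
    """Map each class name to the list of image ids that contain it."""
    index = defaultdict(list)
    for image_id, classes in labels_per_image.items():
        for cls in set(classes):
            index[cls].append(image_id)
    return dict(index)


def class_aware_sample(class_index, n, rng):
    """Draw n (class, image_id) pairs: class first (uniform), then image."""
    classes = sorted(class_index)  # fixed order for reproducibility
    picks = []
    for _ in range(n):
        cls = rng.choice(classes)
        picks.append((cls, rng.choice(class_index[cls])))
    return picks


# Toy, made-up annotation table (assumption): 90 images of a frequent
# structure, 6 with tracheal rings, 4 with vocal folds.
labels_per_image = {}
for i in range(90):
    labels_per_image[f"img{i:03d}"] = ["common_structure"]
for i in range(90, 96):
    labels_per_image[f"img{i:03d}"] = ["tracheal_ring"]
for i in range(96, 100):
    labels_per_image[f"img{i:03d}"] = ["vocal_fold"]

class_index = build_class_index(labels_per_image)
samples = class_aware_sample(class_index, 3000, random.Random(0))
counts = Counter(cls for cls, _ in samples)
print(counts)
```

With this scheme each class receives roughly one third of the draws even though the rare classes appear in only a few images. In a pipeline like the one the abstract describes, the same per-class balancing could in principle decide how many synthetic images to request per class from the diffusion model, but that link is an assumption here.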