skin lesion classification
Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting
Paxton, Kuniko, Dehghani, Zeinab, Aslansefat, Koorosh, Thakker, Dhavalkumar, Papadopoulos, Yiannis
Skin color has historically been a focal point of discrimination, yet fairness research in machine learning for medical imaging often relies on coarse subgroup categories, overlooking individual-level variations. Such group-based approaches risk obscuring biases faced by outliers within subgroups. This study introduces a distribution-based framework for evaluating and mitigating individual fairness in skin lesion classification. We treat skin tone as a continuous attribute rather than a categorical label, and employ kernel density estimation (KDE) to model its distribution. We further compare twelve statistical distance metrics to quantify disparities between skin tone distributions and propose a distance-based reweighting (DRW) loss function to correct underrepresentation in minority tones. Experiments across CNN and Transformer models demonstrate: (i) the limitations of categorical reweighting in capturing individual-level disparities, and (ii) the superior performance of distribution-based reweighting, particularly with Fidelity Similarity (FS), Wasserstein Distance (WD), Hellinger Metric (HM), and Harmonic Mean Similarity (HS). These findings establish a robust methodology for advancing fairness at individual level in dermatological AI systems, and highlight broader implications for sensitive continuous attributes in medical image analysis.
Towards Explainable Skin Cancer Classification: A Dual-Network Attention Model with Lesion Segmentation and Clinical Metadata Fusion
Atiq, Md. Enamul, Fattah, Shaikh Anowarul
Skin cancer is a life-threatening disease where early detection significantly improves patient outcomes. Automated diagnosis from dermoscopic images is challenging due to high intra-class variability and subtle inter-class differences. Many deep learning models operate as "black boxes," limiting clinical trust. In this work, we propose a dual-encoder attention-based framework that leverages both segmented lesions and clinical metadata to enhance skin lesion classification in terms of both accuracy and interpretability. A novel Deep-UNet architecture with Dual Attention Gates (DAG) and Atrous Spatial Pyramid Pooling (ASPP) is first employed to segment lesions. The classification stage uses two DenseNet201 encoders-one on the original image and another on the segmented lesion whose features are fused via multi-head cross-attention. This dual-input design guides the model to focus on salient pathological regions. In addition, a transformer-based module incorporates patient metadata (age, sex, lesion site) into the prediction. We evaluate our approach on the HAM10000 dataset and the ISIC 2018 and 2019 challenges. The proposed method achieves state-of-the-art segmentation performance and significantly improves classification accuracy and average AUC compared to baseline models. To validate our model's reliability, we use Gradient-weighted Class Activation Mapping (Grad-CAM) to generate heatmaps. These visualizations confirm that our model's predictions are based on the lesion area, unlike models that rely on spurious background features. These results demonstrate that integrating precise lesion segmentation and clinical data with attention-based fusion leads to a more accurate and interpretable skin cancer classification model.
Robustness and sex differences in skin cancer detection: logistic regression vs CNNs
Pedersen, Nikolette, Sydendal, Regitze, Wulff, Andreas, Raumanns, Ralf, Petersen, Eike, Cheplygina, Veronika
Deep learning has been reported to achieve high performances in the detection of skin cancer, yet many challenges regarding the reproducibility of results and biases remain. This study is a replication (different data, same analysis) of a previous study on Alzheimer's disease detection, which studied the robustness of logistic regression (LR) and convolutional neural networks (CNN) across patient sexes. We explore sex bias in skin cancer detection, using the PAD-UFES-20 dataset with LR trained on handcrafted features reflecting dermatological guidelines (ABCDE and the 7-point checklist), and a pre-trained ResNet-50 model. We evaluate these models in alignment with the replicated study: across multiple training datasets with varied sex composition to determine their robustness. Our results show that both the LR and the CNN were robust to the sex distribution, but the results also revealed that the CNN had a significantly higher accuracy (ACC) and area under the receiver operating characteristics (AUROC) for male patients compared to female patients. The data and relevant scripts to reproduce our results are publicly available (https://github.com/
Toward Accessible Dermatology: Skin Lesion Classification Using Deep Learning Models on Mobile-Acquired Images
Newaz, Asif, Ishti, Masum Mushfiq, Azam, A Z M Ashraful, Adib, Asif Ur Rahman
Abstract--Skin diseases are among the most prevalent health concerns worldwide, yet conventional diagnostic methods are often costly, complex, and unavailable in low-resource settings. Automated classification using deep learning has emerged as a promising alternative, but existing studies are mostly limited to dermoscopic datasets and a narrow range of disease classes. In this work, we curate a large dataset of over 50 skin disease categories captured with mobile devices, making it more representative of real-world conditions. We evaluate multiple convo-lutional neural networks and Transformer-based architectures, demonstrating that Transformer models, particularly the Swin Transformer, achieve superior performance by effectively capturing global contextual features. T o enhance interpretability, we incorporate Gradient-weighted Class Activation Mapping (Grad-CAM), which highlights clinically relevant regions and provides transparency in model predictions. Our results underscore the potential of Transformer-based approaches for mobile-acquired skin lesion classification, paving the way toward accessible AIassisted dermatological screening and early diagnosis in resource-limited environments. Skin diseases are among the most common health problems worldwide, impacting millions of individuals across the world. According to the World Health Organization(WHO), more than 900 million people are affected by various skin conditions, with skin disorders ranking as the fourth leading cause of nonfatal disease burden worldwide [1]. These diseases range from mild infections to life-threatening chronic disorders.
Enhancing Fairness in Skin Lesion Classification for Medical Diagnosis Using Prune Learning
Paxton, Kuniko, Aslansefat, Koorosh, Thakker, Dhavalkumar, Papadopoulos, Yiannis, Maslekar, Tanaya
-- Recent advances in deep learning have significantly improved the accuracy of skin lesion classification models, supporting medical diagnoses and promoting equitable healthcare. However, concerns remain about potential biases related to skin color, which can impact diagnostic outcomes. Ensuring fairness is challenging due to difficulties in classifying skin tones, high computational demands, and the complexity of objectively verifying fairness. To address these challenges, we propose a fairness algorithm for skin lesion classification that overcomes the challenges associated with achieving diagnostic fairness across varying skin tones. By calculating the skewness of the feature map in the convolution layer of the VGG (Visual Geometry Group) network and the patches and the heads of the Vision Transformer, our method reduces unnecessary channels related to skin tone, focusing instead on the lesion area. This approach lowers computational costs and mitigates bias without relying on conventional statistical methods. It potentially reduces model size while maintaining fairness, making it more practical for real-world applications. In recent years, the predictive performance of deep learning models for skin lesion classification has improved significantly [1], [2]. These models have the potential to assist medical professionals in diagnosing diseases more efficiently, allowing for timely medical diagnosis. Furthermore, they can enable mobile and web-based self-diagnosis tools, improving access to healthcare and promoting equitable delivery in low-and middle-income regions. However, despite these advances, concerns persist that algorithmic and data biases can compromise fairness in healthcare [3], [4].
Semi-Supervised Learning with Online Knowledge Distillation for Skin Lesion Classification
Deep Learning has emerged as a promising approach for skin lesion analysis. However, existing methods mostly rely on fully supervised learning, requiring extensive labeled data, which is challenging and costly to obtain. To alleviate this annotation burden, this study introduces a novel semi-supervised deep learning approach that integrates ensemble learning with online knowledge distillation for enhanced skin lesion classification. Our methodology involves training an ensemble of convolutional neural network models, using online knowledge distillation to transfer insights from the ensemble to its members. This process aims to enhance the performance of each model within the ensemble, thereby elevating the overall performance of the ensemble itself. Post-training, any individual model within the ensemble can be deployed at test time, as each member is trained to deliver comparable performance to the ensemble. This is particularly beneficial in resource-constrained environments. Experimental results demonstrate that the knowledge-distilled individual model performs better than independently trained models. Our approach demonstrates superior performance on both the \emph{International Skin Imaging Collaboration} 2018 and 2019 public benchmark datasets, surpassing current state-of-the-art results. By leveraging ensemble learning and online knowledge distillation, our method reduces the need for extensive labeled data while providing a more resource-efficient solution for skin lesion classification in real-world scenarios.
Domain Adaptive Skin Lesion Classification via Conformal Ensemble of Vision Transformers
Zoravar, Mehran, Alijani, Shadi, Najjaran, Homayoun
Exploring the trustworthiness of deep learning models is crucial, especially in critical domains such as medical imaging decision support systems. Conformal prediction has emerged as a rigorous means of providing deep learning models with reliable uncertainty estimates and safety guarantees. However, conformal prediction results face challenges due to the backbone model's struggles in domain-shifted scenarios, such as variations in different sources. To aim this challenge, this paper proposes a novel framework termed Conformal Ensemble of Vision Transformers (CE-ViTs) designed to enhance image classification performance by prioritizing domain adaptation and model robustness, while accounting for uncertainty. The proposed method leverages an ensemble of vision transformer models in the backbone, trained on diverse datasets including HAM10000, Dermofit, and Skin Cancer ISIC datasets. This ensemble learning approach, calibrated through the combined mentioned datasets, aims to enhance domain adaptation through conformal learning. Experimental results underscore that the framework achieves a high coverage rate of 90.38\%, representing an improvement of 9.95\% compared to the HAM10000 model. This indicates a strong likelihood that the prediction set includes the true label compared to singular models. Ensemble learning in CE-ViTs significantly improves conformal prediction performance, increasing the average prediction set size for challenging misclassified samples from 1.86 to 3.075.
An Attention-Guided Deep Learning Approach for Classifying 39 Skin Lesion Types
Hanum, Sauda Adiv, Dey, Ashim, Kabir, Muhammad Ashad
The skin, as the largest organ of the human body, is vulnerable to a diverse array of conditions collectively known as skin lesions, which encompass various dermatoses. Diagnosing these lesions presents significant challenges for medical practitioners due to the subtle visual differences that are often imperceptible to the naked eye. While not all skin lesions are life-threatening, certain types can act as early indicators of severe diseases, including skin cancers, underscoring the critical need for timely and accurate diagnostic methods. Deep learning algorithms have demonstrated remarkable potential in facilitating the early detection and prognosis of skin lesions. This study advances the field by curating a comprehensive and diverse dataset comprising 39 categories of skin lesions, synthesized from five publicly available datasets. Using this dataset, the performance of five state-of-the-art deep learning models -- MobileNetV2, Xception, InceptionV3, EfficientNetB1, and Vision Transformer - is rigorously evaluated. To enhance the accuracy and robustness of these models, attention mechanisms such as the Efficient Channel Attention (ECA) and the Convolutional Block Attention Module (CBAM) are incorporated into their architectures. Comprehensive evaluation across multiple performance metrics reveals that the Vision Transformer model integrated with CBAM outperforms others, achieving an accuracy of 93.46%, precision of 94%, recall of 93%, F1-score of 93%, and specificity of 93.67%. These results underscore the significant potential of the proposed system in supporting medical professionals with accurate and efficient prognostic tools for diagnosing a broad spectrum of skin lesions. The dataset and code used in this study can be found at https://github.com/akabircs/Skin-Lesions-Classification.
Meta-Transfer Derm-Diagnosis: Exploring Few-Shot Learning and Transfer Learning for Skin Disease Classification in Long-Tail Distribution
Özdemir, Zeynep, Keles, Hacer Yalim, Tanrıöver, Ömer Özgür
Over the past decade, the field of medical image analysis has witnessed remarkable advancements, primarily driven by the development of deep convolutional neural networks and the availability of extensive labeled image datasets. These advancements have notably impacted various tasks, including organ segmentation [1, 2], tumor segmentation [3, 4], and disease detection [5, 6]. Although abundant data exists for common diseases, a significant gap persists in data availability for the over 6,000 known rare diseases, affecting approximately 7% of the global population [7]. The diagnosis of these rare diseases, including some skin conditions, presents unique challenges, particularly due to the limited number of clinical examples available for training deep learning models. The automatic classification of skin lesions exemplifies these challenges, as it is complicated by the long-tailed distribution of skin disease datasets, the subtle variations in lesion appearances, and the overall scarcity of sufficient image data [8]. Various studies have been conducted to address the problem of skin disease classification using deep learning approaches. Recent advancements in this field are mainly in three categories: methods based on transfer learning [12, 13], those relying on few-shot learning [8, 14-19], and approaches using cross-domain few-shot learning [20]. The state of the art models in this domain, such as Meta-DermDiagnosis, MetaMed, and PCN models [8, 14, 18], are designed to extract and learn high-level, domain-specific features during their training process.
Transformer-based interpretable multi-modal data fusion for skin lesion classification
Cheslerean-Boghiu, Theodor, Fleischmann, Melia-Evelina, Willem, Theresa, Lasser, Tobias
A lot of deep learning (DL) research these days is mainly focused on improving quantitative metrics regardless of other factors. In human-centered applications, like skin lesion classification in dermatology, DL-driven clinical decision support systems are still in their infancy due to the limited transparency of their decision-making process. Moreover, the lack of procedures that can explain the behavior of trained DL algorithms leads to almost no trust from clinical physicians. To diagnose skin lesions, dermatologists rely on visual assessment of the disease and the data gathered from the patient's anamnesis. Data-driven algorithms dealing with multi-modal data are limited by the separation of feature-level and decision-level fusion procedures required by convolutional architectures. To address this issue, we enable single-stage multi-modal data fusion via the attention mechanism of transformer-based architectures to aid in diagnosing skin diseases. Our method beats other state-of-the-art single- and multi-modal DL architectures in image-rich and patient-data-rich environments. Additionally, the choice of the architecture enables native interpretability support for the classification task both in the image and metadata domain with no additional modifications necessary.