
Skin Color


Mitigating Individual Skin Tone Bias in Skin Lesion Classification through Distribution-Aware Reweighting

Paxton, Kuniko, Dehghani, Zeinab, Aslansefat, Koorosh, Thakker, Dhavalkumar, Papadopoulos, Yiannis

arXiv.org Artificial Intelligence

Skin color has historically been a focal point of discrimination, yet fairness research in machine learning for medical imaging often relies on coarse subgroup categories, overlooking individual-level variations. Such group-based approaches risk obscuring biases faced by outliers within subgroups. This study introduces a distribution-based framework for evaluating and mitigating individual fairness in skin lesion classification. We treat skin tone as a continuous attribute rather than a categorical label, and employ kernel density estimation (KDE) to model its distribution. We further compare twelve statistical distance metrics to quantify disparities between skin tone distributions and propose a distance-based reweighting (DRW) loss function to correct the underrepresentation of minority tones. Experiments across CNN and Transformer models demonstrate: (i) the limitations of categorical reweighting in capturing individual-level disparities, and (ii) the superior performance of distribution-based reweighting, particularly with Fidelity Similarity (FS), Wasserstein Distance (WD), Hellinger Metric (HM), and Harmonic Mean Similarity (HS). These findings establish a robust methodology for advancing fairness at the individual level in dermatological AI systems, and highlight broader implications for sensitive continuous attributes in medical image analysis.
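
The pipeline the abstract describes is concrete enough to sketch: fit a KDE over a continuous tone score, compare tone distributions with a statistical distance, and reweight samples inversely to their estimated density. The snippet below is a minimal illustration under those assumptions; the tone score, weighting rule, and normalization are placeholders rather than the authors' implementation, and in training the weights would multiply each sample's classification loss.

```python
import numpy as np
from scipy.stats import gaussian_kde, wasserstein_distance

def drw_weights(tones, eps=1e-6):
    """Weight each sample inversely to the KDE-estimated density of its
    tone, so rare tones contribute more to the weighted training loss."""
    kde = gaussian_kde(tones)          # continuous model of the tone distribution
    weights = 1.0 / (kde(tones) + eps)
    return weights / weights.mean()    # normalize so the average weight is 1

# Toy tone scores (e.g., Individual Typology Angle), skewed toward light skin
rng = np.random.default_rng(0)
tones = np.concatenate([rng.normal(40, 8, 900), rng.normal(-10, 8, 100)])

w = drw_weights(tones)
print(f"mean weight, rare tones:   {w[tones < 10].mean():.2f}")
print(f"mean weight, common tones: {w[tones >= 10].mean():.2f}")

# One of the twelve compared disparity measures, on two tone samples
print(f"Wasserstein: {wasserstein_distance(tones[:500], tones[500:]):.2f}")
```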


Enhancing Fairness in Skin Lesion Classification for Medical Diagnosis Using Prune Learning

Paxton, Kuniko, Aslansefat, Koorosh, Thakker, Dhavalkumar, Papadopoulos, Yiannis, Maslekar, Tanaya

arXiv.org Artificial Intelligence

Recent advances in deep learning have significantly improved the accuracy of skin lesion classification models, supporting medical diagnoses and promoting equitable healthcare. However, concerns remain about potential biases related to skin color, which can impact diagnostic outcomes. Ensuring fairness is challenging due to difficulties in classifying skin tones, high computational demands, and the complexity of objectively verifying fairness. To address these challenges, we propose a fairness algorithm for skin lesion classification that achieves diagnostic fairness across varying skin tones. By calculating the skewness of feature maps in the convolutional layers of the VGG (Visual Geometry Group) network, and of the patches and heads of the Vision Transformer, our method prunes unnecessary channels related to skin tone, focusing instead on the lesion area. This approach lowers computational costs and mitigates bias without relying on conventional statistical methods. It potentially reduces model size while maintaining fairness, making it more practical for real-world applications. In recent years, the predictive performance of deep learning models for skin lesion classification has improved significantly [1], [2]. These models have the potential to assist medical professionals in diagnosing diseases more efficiently, allowing for timely diagnosis. Furthermore, they can enable mobile and web-based self-diagnosis tools, improving access to healthcare and promoting equitable delivery in low- and middle-income regions. However, despite these advances, concerns persist that algorithmic and data biases can compromise fairness in healthcare [3], [4].
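
Assuming the criterion works roughly as described, a toy version of the skewness test is easy to write down. In this sketch the threshold, the use of zeroing rather than structural channel removal, and the dummy activations are all illustrative choices, not the paper's:

```python
import torch
from scipy.stats import skew

def low_skew_mask(feature_map: torch.Tensor, threshold: float = 1.0):
    """feature_map: (N, C, H, W) activations from one convolutional layer.
    Keep channels whose activation skewness is small in magnitude."""
    per_channel = feature_map.permute(1, 0, 2, 3).reshape(feature_map.shape[1], -1)
    ch_skew = torch.from_numpy(skew(per_channel.numpy(), axis=1))
    return ch_skew.abs() < threshold

x = torch.randn(8, 64, 32, 32)               # dummy VGG-style activations
x[:, :5] = x[:, :5].abs() ** 2               # make a few channels heavily skewed
mask = low_skew_mask(x)
pruned = x * mask.float().view(1, -1, 1, 1)  # zero out high-skew channels
print(f"kept {int(mask.sum())} of {mask.numel()} channels")
```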


Fairness-Aware Grouping for Continuous Sensitive Variables: Application for Debiasing Face Analysis with respect to Skin Tone

Shilova, Veronika, Malherbe, Emmanuel, Palma, Giovanni, Risser, Laurent, Loubes, Jean-Michel

arXiv.org Artificial Intelligence

Within a legal framework, fairness in datasets and models is typically assessed by dividing observations into predefined groups and then computing fairness measures (e.g., Disparate Impact or Equality of Odds with respect to gender). However, when sensitive attributes such as skin color are continuous, dividing into default groups may overlook or obscure the discrimination experienced by certain minority subpopulations. To address this limitation, we propose a fairness-based grouping approach for continuous (possibly multidimensional) sensitive attributes. By grouping data according to observed levels of discrimination, our method identifies the partition that maximizes a novel criterion based on inter-group variance in discrimination, thereby isolating the most critical subgroups. We validate the proposed approach using multiple synthetic datasets and demonstrate its robustness under changing population distributions, revealing how discrimination manifests within the space of sensitive attributes. Furthermore, we examine a specialized setting of monotonic fairness for the case of skin color. Our empirical results on both CelebA and FFHQ, leveraging the skin tone as predicted by an industrial proprietary algorithm, show that the proposed segmentation uncovers more nuanced patterns of discrimination than previously reported, and that these findings remain stable across datasets for a given model. Finally, we leverage our grouping model for debiasing purposes, aiming to predict fair scores with group-by-group post-processing. The results demonstrate that our approach improves fairness while having minimal impact on accuracy, thus confirming our partition method and opening the door to industrial deployment.
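
For a one-dimensional sensitive attribute and a single cut point, maximizing inter-group variance of discrimination reduces to an Otsu-style threshold search. The sketch below illustrates only that special case; the paper's criterion handles multiple groups and multidimensional attributes, and the discrimination scores here are synthetic:

```python
import numpy as np

def best_cut(sensitive, discrimination, n_candidates=100):
    """Single threshold on a 1-D attribute maximizing between-group
    variance of the per-sample discrimination score."""
    cuts = np.quantile(sensitive, np.linspace(0.05, 0.95, n_candidates))
    best, best_var = None, -np.inf
    for c in cuts:
        lo = discrimination[sensitive <= c]
        hi = discrimination[sensitive > c]
        p = len(lo) / len(sensitive)
        var_between = p * (1 - p) * (lo.mean() - hi.mean()) ** 2
        if var_between > best_var:
            best, best_var = c, var_between
    return best, best_var

rng = np.random.default_rng(1)
tone = rng.uniform(0, 1, 2000)                        # continuous skin tone
disc = np.where(tone > 0.7, 0.30, 0.05) + rng.normal(0, 0.02, 2000)
cut, v = best_cut(tone, disc)
print(f"cut at tone = {cut:.2f} (true change point: 0.70)")
```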


Evaluating Fairness and Mitigating Bias in Machine Learning: A Novel Technique using Tensor Data and Bayesian Regression

Paxton, Kuniko, Aslansefat, Koorosh, Thakker, Dhavalkumar, Papadopoulos, Yiannis

arXiv.org Artificial Intelligence

Fairness is a critical component of Trustworthy AI. In this paper, we focus on Machine Learning (ML) and the performance of model predictions when dealing with skin color. Unlike other sensitive attributes, the nature of skin color differs significantly. In computer vision, skin color is represented as tensor data rather than categorical values or single numerical points. However, much of the research on fairness across sensitive groups has focused on categorical features such as gender and race. This paper introduces a new technique for evaluating fairness in ML for image classification tasks, specifically without the use of annotation. To address the limitations of prior work, we handle tensor data, like skin color, without classifying it rigidly. Instead, we convert it into probability distributions and apply statistical distance measures. This novel approach allows us to capture fine-grained nuances in fairness both within and across what would traditionally be considered distinct groups. Additionally, we propose an innovative training method to mitigate the latent biases present in conventional skin tone categorization. This method leverages color distance estimates calculated through Bayesian regression with polynomial functions, ensuring a more nuanced and equitable treatment of skin color in ML models.
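
Both ingredients named above can be sketched compactly: turning a color tensor into an empirical distribution that is compared with a statistical distance (Hellinger here), and Bayesian regression over polynomial features for color-distance estimates. The patch construction, bin count, and polynomial degree are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.preprocessing import PolynomialFeatures

def hellinger(p, q):
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def patch_to_hist(patch, bins=32):
    """Reduce a skin patch (values in [0, 1]) to a probability distribution."""
    h, _ = np.histogram(patch, bins=bins, range=(0, 1))
    return h / h.sum()

rng = np.random.default_rng(2)
patch_a = rng.beta(5, 2, (64, 64))   # toy lighter-skin patch
patch_b = rng.beta(2, 5, (64, 64))   # toy darker-skin patch
print("Hellinger:", hellinger(patch_to_hist(patch_a), patch_to_hist(patch_b)))

# Bayesian polynomial regression: tone value -> estimated color distance
x = rng.uniform(0, 1, (200, 1))
y = 3 * (x[:, 0] - 0.5) ** 2 + rng.normal(0, 0.05, 200)      # toy target
poly = PolynomialFeatures(degree=3)
model = BayesianRidge().fit(poly.fit_transform(x), y)
mean, std = model.predict(poly.transform([[0.9]]), return_std=True)
print(f"predicted distance {mean[0]:.2f} +/- {std[0]:.2f}")
```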


Using LLMs as prompt modifier to avoid biases in AI image generators

Peinl, René

arXiv.org Artificial Intelligence

This study examines how Large Language Models (LLMs) can reduce biases in text-to-image generation systems by modifying user prompts. We define bias as a model's unfair deviation from population statistics given neutral prompts. Our experiments with Stable Diffusion XL, Stable Diffusion 3.5, and Flux demonstrate that LLM-modified prompts significantly increase image diversity and reduce bias without the need to change the image generators themselves. While occasionally producing results that diverge from original user intent for elaborate prompts, this approach generally provides more varied interpretations of underspecified requests rather than superficial variations. The method works particularly well for less advanced image generators, though limitations persist for certain contexts like disability representation. All prompts and generated images are available at https://iisys-hof.github.io/llm-prompt-img-gen/
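
The mechanism is a thin wrapper around any chat LLM: intercept the user's prompt, ask the LLM to add the demographic detail the user left unspecified, and pass the rewritten prompt to the image generator unchanged. A minimal sketch, assuming an OpenAI-style client; the model name and instruction wording are placeholders, not the paper's exact setup:

```python
from openai import OpenAI

REWRITE_INSTRUCTION = (
    "Rewrite the following text-to-image prompt. If it depicts people but "
    "leaves age, gender, or skin tone unspecified, add plausible, varied "
    "attributes reflecting real population statistics. Preserve the user's "
    "intent and reply with the rewritten prompt only."
)

def debias_prompt(user_prompt: str, client: OpenAI) -> str:
    """Return a diversified prompt to feed the image generator."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[
            {"role": "system", "content": REWRITE_INSTRUCTION},
            {"role": "user", "content": user_prompt},
        ],
    )
    return resp.choices[0].message.content

# e.g., debias_prompt("a photo of a doctor", OpenAI()) might return
# "a photo of a middle-aged Black female doctor in a hospital corridor"
```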


Kids are learning how to make their own little language models

MIT Technology Review

"What does it mean to have children see themselves as being builders of AI technologies and not just users?" says Shruti. The program starts out by using a pair of dice to demonstrate probabilistic thinking, a system of decision-making that accounts for uncertainty. Probabilistic thinking underlies the LLMs of today, which predict the most likely next word in a sentence. By teaching a concept like it, the program can help to demystify the workings of LLMs for kids and assist them in understanding that sometimes the model's choices are not perfect but the result of a series of probabilities. Students can modify each side of the dice to whatever variable they want.


Hi5: 2D Hand Pose Estimation with Zero Human Annotation

Hasan, Masum, Ozel, Cengiz, Long, Nina, Martin, Alexander, Potter, Samuel, Adnan, Tariq, Lee, Sangwu, Zadeh, Amir, Hoque, Ehsan

arXiv.org Artificial Intelligence

We propose a new large synthetic hand pose estimation dataset, Hi5, and a novel inexpensive method for collecting high-quality synthetic data that requires no human annotation or validation. Leveraging recent advancements in computer graphics, high-fidelity 3D hand models with diverse genders and skin colors, and dynamic environments and camera movements, our data synthesis pipeline allows precise control over data diversity and representation, ensuring robust and fair model training. Using a single consumer PC, we generate a dataset of 583,000 images with accurate pose annotations that closely represents real-world variability. Pose estimation models trained with Hi5 perform competitively on real-hand benchmarks while surpassing models trained with real data when tested on occlusions and perturbations. Our experiments show promising results for synthetic data as a viable solution for data representation problems in real datasets. Overall, this paper provides a promising new approach to synthetic data creation and annotation that can reduce costs and increase the diversity and quality of data for hand pose estimation.
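
The appeal of such a pipeline is that every source of variation is a sampled parameter, so diversity and balance are set by construction and the pose labels come from the renderer for free. A hypothetical sampling spec along those lines (attribute lists and ranges are illustrative, not Hi5's actual configuration):

```python
import random
from dataclasses import dataclass

@dataclass
class HandRenderSpec:
    gender: str
    skin_tone: float            # position on a continuous tone scale [0, 1]
    camera_yaw_deg: float
    joint_angles_deg: list      # drives the 3D model; 2D pose labels follow

def sample_spec(rng: random.Random) -> HandRenderSpec:
    return HandRenderSpec(
        gender=rng.choice(["female", "male"]),
        skin_tone=rng.uniform(0.0, 1.0),       # uniform -> balanced tones
        camera_yaw_deg=rng.uniform(-180, 180),
        joint_angles_deg=[rng.uniform(0, 90) for _ in range(21)],
    )

rng = random.Random(42)
specs = [sample_spec(rng) for _ in range(5)]   # scale up to 583,000 renders
print(specs[0])
```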


Improving Fairness using Vision-Language Driven Image Augmentation

D'Incà, Moreno, Tzelepis, Christos, Patras, Ioannis, Sebe, Nicu

arXiv.org Artificial Intelligence

Fairness is crucial when training a deep-learning discriminative model, especially in the facial domain. Models tend to correlate specific characteristics (such as age and skin color) with unrelated attributes (downstream tasks), resulting in biases which do not correspond to reality. It is common knowledge that these correlations are present in the data and are then transferred to the models during training. This paper proposes a method to mitigate these correlations to improve fairness. To do so, we learn interpretable and meaningful paths lying in the semantic space of a pre-trained diffusion model (DiffAE) -- such paths being supervised by contrastive text dipoles. That is, we learn to edit protected characteristics (age and skin color). These paths are then applied to augment images to improve the fairness of a given dataset. We test the proposed method on CelebA-HQ and UTKFace on several downstream tasks with age and skin color as protected characteristics. As a proxy for fairness, we compute the difference in accuracy with respect to the protected characteristics. Quantitative results show how the augmented images help the model improve the overall accuracy, the aforementioned metric, and the disparity of equal opportunity. Code is available at: https://github.com/Moreno98/Vision-Language-Bias-Control.
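
The fairness proxy named in the abstract, the difference in accuracy with respect to a protected characteristic, is simple to state in code; the binary groups and the synthetic biased predictions below are assumptions for illustration:

```python
import numpy as np

def accuracy_gap(y_true, y_pred, protected):
    """Absolute accuracy difference between protected groups 0 and 1."""
    accs = [np.mean(y_true[protected == g] == y_pred[protected == g])
            for g in (0, 1)]
    return abs(accs[0] - accs[1])

rng = np.random.default_rng(3)
y = rng.integers(0, 2, 1000)
g = rng.integers(0, 2, 1000)          # protected characteristic
# Simulate a model whose errors concentrate on group 1
pred = np.where((g == 1) & (rng.random(1000) < 0.2), 1 - y, y)
print(f"accuracy gap: {accuracy_gap(y, pred, g):.3f}")  # roughly 0.2
```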


AI Algorithms Are Biased Against Skin With Yellow Hues

WIRED

After evidence surfaced in 2018 that leading face-analysis algorithms were less accurate for people with darker skin, companies including Google and Meta adopted measures of skin tone to test the effectiveness of their AI software. New research from Sony suggests that those tests are blind to a crucial aspect of the diversity of human skin color. By expressing skin tone using only a sliding scale from lightest to darkest or white to black, today's common measures ignore the contribution of yellow and red hues to the range of human skin, according to Sony researchers. They found that generative AI systems, image-cropping algorithms, and photo analysis tools all struggle with yellower skin in particular. The same weakness could apply to a variety of technologies whose accuracy is proven to be affected by skin color, such as AI software for face recognition, body tracking, and deepfake detection, or gadgets like heart rate monitors and motion detectors. "If products are just being evaluated in this very one-dimensional way, there's plenty of biases that will go undetected and unmitigated," says Alice Xiang, lead research scientist and global head of AI Ethics at Sony.
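
The one-dimensional measure the researchers criticize captures only light-to-dark; adding a hue axis is straightforward once pixels are converted to a perceptual color space. A sketch of that idea using CIELAB via scikit-image; the patch colors are toy values, and using the hue angle as the second axis mirrors Sony's proposal rather than reproducing their method:

```python
import numpy as np
from skimage.color import rgb2lab

def tone_and_hue(rgb_patch):
    """Mean L* (light-dark axis) and hue angle atan2(b*, a*) in degrees
    (red-to-yellow axis) for an sRGB patch with values in [0, 1]."""
    lab = rgb2lab(rgb_patch)
    L = lab[..., 0].mean()
    a, b = lab[..., 1].mean(), lab[..., 2].mean()
    return L, np.degrees(np.arctan2(b, a))

# Two patches of similar lightness but different yellowness (toy values)
for name, rgb in [("reddish", [0.80, 0.55, 0.50]),
                  ("yellowish", [0.80, 0.65, 0.40])]:
    L, h = tone_and_hue(np.full((8, 8, 3), rgb))
    print(f"{name}: L* = {L:.1f}, hue = {h:.1f} deg")  # a 1-D scale sees only L*
```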


Supreme Court struck down affirmative action, but that won't stop Harvard

FOX News

You probably think the Supreme Court just ended racial discrimination in university admissions, euphemistically called affirmative action, and a new day of equal treatment without regard to race or skin color has dawned. Yes, SCOTUS invalidated the race-conscious practices of Harvard and UNC, holding that under the 14th Amendment a "student must be treated based on his or her experiences as an individual – not on the basis of race." That is a very important statement of our guiding constitutional principles. Yet already schools like Harvard are suggesting they will skirt the ruling by considering applicants' experience with race as opposed to the applicants' race itself. These games are not surprising and have been in the works for months.