Gender Fairness of Machine Learning Algorithms for Pain Detection

Dylan Green, Yuting Shang, Jiaee Cheong, Yang Liu, Hatice Gunes

arXiv.org Artificial Intelligence 

Abstract -- Automated pain detection through machine learning (ML) and deep learning (DL) algorithms holds significant potential in healthcare, particularly for patients unable to self-report pain levels. However, the accuracy and fairness of these algorithms across different demographic groups (e.g., gender) remain under-researched. This paper investigates the gender fairness of ML and DL models trained on the UNBC-McMaster Shoulder Pain Expression Archive Database, evaluating the performance of various models in detecting pain based solely on the visual modality of participants' facial expressions. We compare traditional ML algorithms, Linear Support Vector Machine (LSVM) and Radial Basis Function SVM (RBF SVM), with DL methods, a Convolutional Neural Network (CNN) and a Vision Transformer (ViT), using a range of performance and fairness metrics. While ViT achieved the highest accuracy and the best scores on a selection of fairness metrics, all models exhibited gender-based biases. These findings highlight the persistent trade-off between accuracy and fairness, emphasising the need for fairness-aware techniques to mitigate biases in automated healthcare systems.

Machine Learning (ML) has become an essential tool in modern healthcare, offering the potential to automate complex tasks, such as pain detection, through images and videos [39]. However, as these technologies are adopted, ensuring fairness becomes critical to avoid perpetuating or exacerbating existing biases [79], [9], [73]. ML fairness refers to the absence of prejudice or bias in a machine learning system with respect to sensitive attributes such as gender, race, or age [57]. In pain detection models, fairness ensures that individuals across different demographic groups are equally likely to be correctly classified.
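To make the notion of group fairness concrete, the sketch below (not from the paper; function and variable names are illustrative) computes per-gender accuracy and true-positive rate for a binary pain classifier, along with the absolute gaps between the two groups. The true-positive-rate gap corresponds to the commonly used equal-opportunity criterion: a perfectly fair model in this sense would have a gap of zero.

```python
from collections import defaultdict

def group_fairness_report(y_true, y_pred, gender):
    """Per-group accuracy and true-positive rate (TPR) for a binary
    classifier (1 = pain present), plus absolute gaps between two
    groups. Smaller gaps indicate a fairer model under this criterion."""
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "pos": 0, "tp": 0})
    for t, p, g in zip(y_true, y_pred, gender):
        s = stats[g]
        s["n"] += 1
        s["correct"] += int(t == p)
        if t == 1:                      # ground-truth pain frames
            s["pos"] += 1
            s["tp"] += int(p == 1)      # correctly detected pain
    report = {
        g: {"accuracy": s["correct"] / s["n"],
            "tpr": s["tp"] / s["pos"] if s["pos"] else None}
        for g, s in stats.items()
    }
    groups = sorted(g for g in report)
    if len(groups) == 2:                # e.g. "F" and "M"
        a, b = groups
        report["accuracy_gap"] = abs(report[a]["accuracy"] - report[b]["accuracy"])
        # Equal-opportunity gap: difference in TPR between groups.
        report["equal_opportunity_gap"] = abs(report[a]["tpr"] - report[b]["tpr"])
    return report
```

On a toy example where both groups have equal accuracy but the model detects pain more reliably for one group, the accuracy gap is zero while the equal-opportunity gap is not, illustrating why the paper evaluates several fairness metrics rather than accuracy alone.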