equal distribution
BAD: BiAs Detection for Large Language Models in the context of candidate screening
Koh, Nam Ho, Plata, Joseph, Chai, Joyce
Application Tracking Systems (ATS) have allowed talent managers, recruiters, and college admissions committees to process large volumes of potential candidate applications efficiently. Traditionally, this screening process was conducted manually, creating major bottlenecks due to the quantity of applications and introducing many instances of human bias. The advent of large language models (LLMs) such as ChatGPT, and the potential of adopting such models in current automated application screening, raise additional bias and fairness issues that must be addressed. In this project, we identify and quantify instances of social bias in ChatGPT and other OpenAI LLMs in the context of candidate screening, in order to demonstrate how the use of these models could perpetuate existing biases and inequalities in the hiring process.
Test for non-negligible adverse shifts
Statistical tests for dataset shift are susceptible to false alarms: they are sensitive to minor differences where there is in fact adequate sample coverage and predictive performance. We propose instead a robust framework for tests of dataset shift based on outlier scores, D-SOS for short. D-SOS detects adverse shifts and can identify false alarms caused by benign ones. It posits that a new (test) sample is not substantively worse than an old (training) sample, and not that the two are equal. The key idea is to reduce observations to outlier scores and compare contamination rates. Beyond comparing distributions, users can define what worse means in terms of predictive performance and other relevant notions. We show how versatile and practical D-SOS is for a wide range of real and simulated datasets. Unlike tests of equal distribution and of goodness-of-fit, the D-SOS tests are uniquely tailored to serve as robust performance metrics to monitor model drift and dataset shift.
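The key idea above (reduce observations to outlier scores, then compare contamination rates one-sidedly) can be illustrated with a minimal sketch. This is not the authors' D-SOS statistic: the absolute z-score as outlier score, the 95th-percentile threshold, the permutation test, and all function names are assumptions chosen for illustration only.

```python
import random
import statistics

def outlier_scores(reference, sample):
    """Score each point by its absolute z-score relative to the reference sample."""
    mu = statistics.fmean(reference)
    sigma = statistics.pstdev(reference)
    return [abs(x - mu) / sigma for x in sample]

def adverse_shift_pvalue(train, test, quantile=0.95, n_perm=500, seed=0):
    """One-sided permutation p-value for 'test is more contaminated than train'.

    Contamination rate = share of points whose outlier score exceeds a
    threshold taken from the training scores (an assumed, simplified choice).
    """
    rng = random.Random(seed)
    train_scores = outlier_scores(train, train)
    test_scores = outlier_scores(train, test)
    threshold = sorted(train_scores)[int(quantile * (len(train_scores) - 1))]
    flags = [s > threshold for s in train_scores + test_scores]
    n_train = len(train_scores)

    def rate_gap(f):
        # Test contamination rate minus training contamination rate.
        return sum(f[n_train:]) / len(test_scores) - sum(f[:n_train]) / n_train

    observed = rate_gap(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(flags)
        if rate_gap(flags) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

rng = random.Random(42)
train = [rng.gauss(0, 1) for _ in range(300)]
benign = [rng.gauss(0, 0.5) for _ in range(300)]   # different distribution, fewer outliers
adverse = [rng.gauss(0, 2) for _ in range(300)]    # different distribution, more outliers

p_benign = adverse_shift_pvalue(train, benign)
p_adverse = adverse_shift_pvalue(train, adverse)
print(f"benign shift p-value:  {p_benign:.3f}")
print(f"adverse shift p-value: {p_adverse:.3f}")
```

Both simulated test samples differ in distribution from the training sample, so a test of equal distribution would be expected to flag both. The one-sided comparison here is built to flag only the sample with the higher contamination rate, which is the sense in which a benign shift does not raise an alarm.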
Limitations of Pinned AUC for Measuring Unintended Bias
Borkan, Daniel, Dixon, Lucas, Li, John, Sorensen, Jeffrey, Thain, Nithum, Vasserman, Lucy
This report examines the Pinned AUC metric introduced in [2] and highlights some of its limitations. Pinned AUC provides a threshold-agnostic measure of unintended bias in a classification model, inspired by the ROC-AUC metric. However, as we highlight in this report, the metric can obscure different kinds of unintended bias when the underlying class distributions on which bias is being measured are not carefully controlled. In [2], Pinned AUC is applied to a synthetically generated test set where all identity subgroups have identical representation of the classification labels. This method of controlling the class distributions avoids Pinned AUC's potential to obscure unintended biases. However, if the test data contains different distributions of classification labels between identities, Pinned AUC's measurement of bias can be skewed, either over- or under-representing the extent of unintended bias. In this report, we demonstrate the reasons for Pinned AUC's lack of robustness to variations in the class distributions. We also illustrate how unintended bias identified by Pinned AUC can be decomposed into the metrics presented in [1]. Because careful class balancing is hard to do on real data, the threshold-agnostic metrics presented in [1] can be used instead of Pinned AUC; these are robust to variations in the class distributions and provide a more nuanced view of unintended bias.
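The sensitivity to class distributions described above can be illustrated with a small simulation. The sketch assumes that Pinned AUC for a subgroup is the ROC-AUC computed on the subgroup's examples combined with an equal-sized random sample of the full dataset; that definition, the simulated scoring model, and every name below are assumptions for illustration, not the report's code or data.

```python
import random

def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney U statistic (tied scores get average rank)."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def pinned_auc(subgroup, full, rng):
    """AUC on the subgroup plus an equal-sized sample of the full data (assumed definition)."""
    pinned = subgroup + rng.sample(full, len(subgroup))
    return roc_auc([y for y, _ in pinned], [s for _, s in pinned])

def simulate(n, positive_rate, shift, rng):
    """Hypothetical model: score = label + subgroup shift + Gaussian noise."""
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < positive_rate else 0
        rows.append((y, y + shift + rng.gauss(0, 0.5)))
    return rows

rng = random.Random(0)
background = simulate(20000, 0.5, 0.0, rng)
# Both subgroups receive the same score shift (the same unintended bias);
# only their label balance differs.
balanced_sub = simulate(2000, 0.5, 0.5, rng)
skewed_sub = simulate(2000, 0.9, 0.5, rng)

auc_balanced = pinned_auc(balanced_sub, background + balanced_sub, rng)
auc_skewed = pinned_auc(skewed_sub, background + skewed_sub, rng)
print(f"pinned AUC, balanced subgroup: {auc_balanced:.3f}")
print(f"pinned AUC, skewed subgroup:   {auc_skewed:.3f}")
```

In this simulation the two subgroups are scored with an identical shift, yet the subgroup with 90% positive labels obtains a noticeably higher Pinned AUC than the balanced one, so the same bias is reported differently depending only on the label distribution.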