Robustness Disparities in Face Detection
Dooley, Samuel, Wei, George Z., Goldstein, Tom, Dickerson, John P.
Facial analysis systems have been deployed by large companies and critiqued by scholars and activists for the past decade. Many existing algorithmic audits examine the performance of these systems on later-stage elements of facial analysis, like facial recognition and age, emotion, or perceived gender prediction; however, a core component of these systems has been vastly understudied from a fairness perspective: face detection, sometimes called face localization. Since face detection is a prerequisite step in facial analysis systems, the bias we observe in face detection will flow downstream to other components like facial recognition and emotion prediction. Additionally, no prior work has focused on the robustness of these systems under various perturbations and corruptions, which leaves open the question of how different people are impacted by these phenomena. We present a first-of-its-kind detailed benchmark of face detection systems, specifically examining the robustness to noise of commercial and academic models. We use both standard and recently released academic facial datasets to quantitatively analyze trends in face detection robustness. Across all datasets and systems, we generally find that photos of individuals who are masculine presenting, older, of darker skin type, or photographed in dim lighting are more susceptible to errors than their counterparts.
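As a rough illustration of the kind of measurement such a benchmark involves, the sketch below corrupts images with Gaussian noise at increasing severity and tracks per-group detection failure rates. This is a minimal sketch, not the benchmark's actual protocol: the `detector` callable, the dataset format, and the severity scale are all illustrative assumptions, and the real study covers many corruption types beyond Gaussian noise.

```python
import numpy as np
from collections import defaultdict

def add_gaussian_noise(image, severity=1):
    """Corrupt an HxWx3 uint8 image with Gaussian noise; severity in 1..5.
    The sigma schedule below is an illustrative assumption."""
    sigma = [8, 16, 32, 64, 128][severity - 1]
    noisy = image.astype(np.float32) + np.random.normal(0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def benchmark_robustness(detector, dataset, severities=(1, 2, 3, 4, 5)):
    """dataset yields (image, group_label, true_box); detector(image) -> boxes.

    Returns, per demographic group, the fraction of images on which the
    detector finds no face at all under each corruption severity.
    """
    failures = defaultdict(lambda: defaultdict(int))
    counts = defaultdict(int)
    for image, group, _true_box in dataset:
        counts[group] += 1
        for s in severities:
            boxes = detector(add_gaussian_noise(image, s))
            if len(boxes) == 0:  # detector missed the face entirely
                failures[group][s] += 1
    return {g: {s: failures[g][s] / counts[g] for s in severities}
            for g in counts}
```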
Are Commercial Face Detection Models as Biased as Academic Models?
Dooley, Samuel, Wei, George Z., Goldstein, Tom, Dickerson, John P.
As facial recognition systems are deployed more widely, scholars and activists have studied their biases and harms. Audits are commonly used to accomplish this; they compare the performance of facial recognition systems on datasets with various metadata labels about the subjects of the images. Seminal works have found discrepancies in performance by gender expression, age, perceived race, skin type, etc. These studies and audits often examine algorithms which fall into two categories: academic models or commercial models. We present a detailed comparison between academic and commercial face detection systems, specifically examining robustness to noise. We find that state-of-the-art academic face detection models exhibit demographic disparities in their noise robustness, with statistically significantly decreased performance on older individuals and those who present their gender in a masculine manner. When we compare the size of these disparities to those of commercial models, we conclude that commercial models -- despite their relatively larger development budgets and industry-level fairness commitments -- are always at least as biased as an academic model.
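One simple way to compare the size of disparities across models, sketched below, is to summarize each model's per-group robustness scores by the gap between its best- and worst-served groups. The function and the example inputs are illustrative assumptions, not the paper's metric or its reported results.

```python
import numpy as np

def disparity(per_group_scores):
    """Summarize disparity as the gap between the best- and
    worst-served demographic groups (larger gap = more biased)."""
    vals = np.array(list(per_group_scores.values()))
    return float(vals.max() - vals.min())

# Hypothetical per-group robustness scores (higher is better);
# the numbers are placeholders, not results from the paper.
academic = {"younger": 0.91, "older": 0.85, "masculine": 0.87, "feminine": 0.90}
commercial = {"younger": 0.93, "older": 0.81, "masculine": 0.85, "feminine": 0.92}

print("academic disparity:  ", disparity(academic))    # 0.06
print("commercial disparity:", disparity(commercial))  # 0.12
```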
Comparing Human and Machine Bias in Face Recognition
Dooley, Samuel, Downing, Ryan, Wei, George, Shankar, Nathan, Thymes, Bradon, Thorkelsdottir, Gudrun, Kurtz-Miott, Tiye, Mattson, Rachel, Obiwumi, Olufemi, Cherepanova, Valeriia, Goldblum, Micah, Dickerson, John P., Goldstein, Tom
Much recent research has uncovered and discussed serious concerns of bias in facial analysis technologies, finding performance disparities between groups of people based on perceived gender, skin type, lighting condition, etc. These audits are immensely important and successful at measuring algorithmic bias but have two major challenges: the audits (1) use facial recognition datasets which lack quality metadata, like LFW and CelebA, and (2) do not compare their observed algorithmic bias to the biases of their human alternatives. In this paper, we release improvements to the LFW and CelebA datasets which will enable future researchers to obtain measurements of algorithmic bias that are not tainted by major flaws in the dataset (e.g., identical images appearing in both the gallery and test set). We also use these new data to develop a series of challenging facial identification and verification questions that we administered to various algorithms and a large, balanced sample of human reviewers. We find that both computer models and human survey participants perform significantly better on the verification task than on the identification task, generally obtain lower accuracy rates on dark-skinned or female subjects on both tasks, and obtain higher accuracy rates when their own demographics match those of the subjects in the question. Computer models achieve higher accuracy than the survey participants on both tasks and exhibit bias to similar degrees as the human survey participants.
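To make the audit setup concrete, here is a minimal sketch of how per-group verification accuracy might be computed from pair similarity scores; the input format and threshold are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def verification_accuracy_by_group(scores, labels, groups, threshold=0.5):
    """Per-group accuracy on a face verification task.

    scores: similarity score for each image pair;
    labels: 1 if the pair shows the same identity, else 0;
    groups: demographic label for each pair (e.g. skin type).
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    groups = np.asarray(groups)
    preds = (scores >= threshold).astype(int)  # "same identity" calls
    return {g: float((preds[groups == g] == labels[groups == g]).mean())
            for g in np.unique(groups)}
```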
PreferenceNet: Encoding Human Preferences in Auction Design with Deep Learning
Peri, Neehar, Curry, Michael J., Dooley, Samuel, Dickerson, John P.
The design of optimal auctions is a problem of interest in economics, game theory, and computer science. Despite decades of effort, strategyproof, revenue-maximizing auction designs are still not known outside of restricted settings. However, recent methods using deep learning have shown some success in approximating optimal auctions, recovering several known solutions and outperforming strong baselines when optimal auctions are not known. In addition to maximizing revenue, auction mechanisms may also seek to encourage socially desirable constraints such as allocation fairness or diversity. However, these philosophical notions have neither standardized nor widely accepted formal definitions. In this paper, we propose PreferenceNet, an extension of existing neural-network-based auction mechanisms to encode constraints using (potentially human-provided) exemplars of desirable allocations. In addition, we introduce a new metric to evaluate an auction allocation's adherence to such socially desirable constraints and demonstrate that our proposed method is competitive with current state-of-the-art neural-network-based auction designs. We validate our approach through human subject research and show that we are able to effectively capture real human preferences. Our code is available at https://github.com/neeharperi/PreferenceNet
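The sketch below conveys the core idea of encoding exemplar-derived preferences as a learned soft constraint on an auction network's allocations; it is a minimal PyTorch sketch under assumed shapes and names, not the authors' architecture or training procedure.

```python
import torch
import torch.nn as nn

class PreferenceClassifier(nn.Module):
    """Scores how well an allocation matches exemplars of desirable
    allocations (e.g. fair ones). Trained on labeled exemplars, then
    used as a soft constraint on the auction network (shapes assumed)."""
    def __init__(self, n_agents, n_items, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_agents * n_items, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, allocation):  # allocation: (batch, n_agents, n_items)
        return self.net(allocation.flatten(1)).squeeze(-1)

def augmented_loss(revenue, allocation, pref_clf, lam=1.0):
    """Negated revenue plus a penalty for allocations the preference
    classifier deems undesirable (score near 0); lam trades them off."""
    pref_score = pref_clf(allocation)
    return -revenue.mean() + lam * (1.0 - pref_score).mean()
```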
The Affiliate Matching Problem: On Labor Markets where Firms are Also Interested in the Placement of Previous Workers
Dooley, Samuel, Dickerson, John P.
In many labor markets, workers and firms are connected via affiliative relationships. A management consulting firm wishes both to accept the best new workers and to place its current affiliated workers at strong firms. Similarly, a research university wishes to hire strong job market candidates while also placing its own candidates at strong peer universities. We model this affiliate matching problem as a generalization of the classic stable marriage setting by permitting firms to state preferences not just over which workers they are matched with, but also over which firms their affiliated workers are matched to. Based on results from a human survey, we find that participants (acting as firms) give preference to their own affiliated workers in surprising ways that violate some assumptions of the classical stable marriage problem. This motivates a nuanced discussion of how stability could be defined in affiliate matching problems; we give an example of a marketplace which admits a stable match under one natural definition of stability but not under a different, equally natural definition. We conclude by setting a research agenda toward the creation of a centralized clearing mechanism in this general setting.
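The toy example below (not the paper's example) shows how the stability verdict for a given matching can flip depending on whether a deviating firm evaluates only its own hire or the full outcome including its affiliate's placement; all names and utilities are illustrative assumptions.

```python
from itertools import permutations

# Toy two-firm, two-worker affiliate market. Worker w1 is F1's
# affiliate; w2 is F2's affiliate. Numbers are illustrative.
FIRMS, WORKERS = ["F1", "F2"], ["w1", "w2"]
AFFILIATE = {"F1": "w1", "F2": "w2"}

# Firms rank full outcomes (own hire, affiliate's employer), not just hires.
firm_util = {
    "F1": {("w2", "F2"): 3, ("w1", "F1"): 2, ("w1", "F2"): 1, ("w2", "F1"): 0},
    "F2": {("w1", "F1"): 3, ("w2", "F2"): 2, ("w2", "F1"): 1, ("w1", "F2"): 0},
}
worker_pref = {"w1": ["F2", "F1"], "w2": ["F1", "F2"]}  # most preferred first

def outcome(firm, match):  # match: worker -> firm
    own_hire = next(w for w in WORKERS if match[w] == firm)
    return (own_hire, match[AFFILIATE[firm]])

def blocked(match, affiliate_aware):
    """Does some (firm, worker) pair block the matching?"""
    for f in FIRMS:
        for w in WORKERS:
            if match[w] == f:
                continue
            if worker_pref[w].index(f) >= worker_pref[w].index(match[w]):
                continue  # the worker does not prefer f
            # Deviation: w moves to f; the displaced workers swap (2x2 market).
            new = {w: f, next(x for x in WORKERS if x != w):
                   next(g for g in FIRMS if g != f)}
            if affiliate_aware:  # firm weighs its affiliate's new placement
                better = firm_util[f][outcome(f, new)] > firm_util[f][outcome(f, match)]
            else:  # classical: compare own hires only, affiliate term held fixed
                better = firm_util[f][(w, outcome(f, match)[1])] > \
                         firm_util[f][outcome(f, match)]
            if better:
                return True
    return False

for perm in permutations(FIRMS):
    match = dict(zip(WORKERS, perm))
    print(match,
          "stable (classical):", not blocked(match, affiliate_aware=False),
          "stable (affiliate-aware):", not blocked(match, affiliate_aware=True))
```

Running this, the matching {w1: F1, w2: F2} is stable under the classical notion but blocked once firms account for their affiliates' placements, illustrating how the choice of stability definition changes the verdict.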