Intersectionality is a framework that analyzes how interlocking systems of power and oppression affect individuals along overlapping dimensions including race, gender, sexual orientation, class, and disability. Intersectionality theory therefore implies it is important that fairness in artificial intelligence systems be protected with regard to multi-dimensional protected attributes. However, the measurement of fairness becomes statistically challenging in the multi-dimensional setting due to data sparsity, which increases rapidly with the number of dimensions and with the number of values per dimension. We present a Bayesian probabilistic modeling approach for the reliable, data-efficient estimation of fairness with multi-dimensional protected attributes, which we apply to novel intersectional fairness metrics. Experimental results on census data and the COMPAS criminal justice recidivism dataset demonstrate the utility of our methodology, and show that Bayesian methods are valuable for the modeling and measurement of fairness in an intersectional context.
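To make the sparsity problem concrete, the following is a minimal sketch of how a Bayesian prior can stabilize per-group outcome rates when some intersectional groups have very few records. It is not the authors' model or their metrics: the Beta(alpha, beta) prior, the function names `smoothed_rates` and `max_log_ratio`, and the use of a maximum log-ratio between groups as a disparity summary are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def smoothed_rates(records, alpha=1.0, beta=1.0):
    """Posterior-mean positive-outcome rate for each intersectional group.

    records: iterable of (group, outcome) pairs, where group is a tuple of
    protected-attribute values and outcome is 0 or 1.
    alpha, beta: pseudo-counts of an assumed Beta prior (illustrative defaults).
    """
    positives = Counter()
    totals = Counter()
    for group, outcome in records:
        totals[group] += 1
        positives[group] += outcome
    # Beta-Binomial posterior mean: small groups are pulled toward the prior mean.
    return {
        g: (positives[g] + alpha) / (totals[g] + alpha + beta)
        for g in totals
    }


def max_log_ratio(rates):
    """Largest absolute log-ratio of rates between any two groups (illustrative)."""
    return max(
        (abs(math.log(a) - math.log(b)) for a, b in combinations(rates.values(), 2)),
        default=0.0,
    )


if __name__ == "__main__":
    # Hypothetical toy data: one group has a single record.
    toy = [(("F", "A"), 1), (("F", "A"), 0), (("M", "B"), 1)]
    rates = smoothed_rates(toy)
    print(rates, max_log_ratio(rates))
```

With a single positive record, the raw rate for a group would be 1.0; the prior keeps the estimate away from that extreme, which is the kind of data-efficiency concern the abstract raises.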
Large-scale algorithmic decision making, often driven by machine learning on consumer data, has increasingly run afoul of various social norms, laws and regulations. A prominent concern is when a learned model exhibits discrimination against some demographic group, perhaps based on race or gender. Concerns over such algorithmic discrimination have led to a recent flurry of research on fairness in machine learning, which includes both new tools and methods for designing fair models, and studies of the tradeoffs between predictive accuracy and fairness [ACM, 2019]. At the same time, both recent and longstanding laws and regulations often restrict the use of "sensitive" or protected attributes in algorithmic decision-making. U.S. law prevents the use of race in the development or deployment of consumer lending or credit scoring models, and recent provisions in the E.U. General Data Protection Regulation (GDPR) restrict or prevent even the collection of racial data for consumers. These two developments -- the demand for non-discriminatory algorithms and models on the one hand, and the restriction on the collection or use of protected attributes on the other -- present technical conundrums, since the most straightforward methods for ensuring fairness generally require knowing or using the attribute being protected. It seems difficult to guarantee that a trained model is not discriminating against, say, a racial group if we cannot even identify members of that group in the data. A recent paper [Kilbertus et al., 2018] made these cogent observations, and proposed an interesting solution employing the cryptographic tool of secure multiparty computation (commonly abbreviated MPC). In their model, we imagine a commercial entity with access to consumer data that excludes race, but this entity would like to build a predictive model for, say, commercial lending, under the constraint that the model be non-discriminatory by race with respect to some standard fairness notion (e.g.
For the past five years, Cynthia Dwork has been working to create a new field of research on algorithmic fairness. Theoretical computer science can be as remote and abstract as pure mathematics, but new research often begins in response to concrete, real-world problems. Such is the case with the work of Cynthia Dwork. Over the course of a distinguished career, Dwork has crafted rigorous solutions to dilemmas that crop up at the messy interface between computing power and human activity. She is most famous for her invention in the early to mid-2000s of "differential privacy," a set of techniques that safeguard the privacy of individuals in a large database.
Differential privacy ensures, for example, that a person can contribute her genetic information to a medical database without fear that anyone analyzing the database will be able to figure out which genetic information is hers, or even whether she has participated in the database at all.
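As a rough illustration of that guarantee, here is a minimal sketch of one commonly used differentially private mechanism, the Laplace mechanism applied to a counting query. The article does not say which techniques it has in mind, so the choice of mechanism, the function names `laplace_noise` and `private_count`, and the `epsilon` parameter are assumptions for illustration only.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(rows, predicate, epsilon, rng=None):
    """Return a noisy count of rows matching `predicate`.

    Adding or removing one person changes a count by at most 1, so the
    noise scale is 1 / epsilon (smaller epsilon means more noise).
    """
    rng = rng or random.Random()
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical database: 30 of 100 participants carry some genetic variant.
    database = [{"carrier": i < 30} for i in range(100)]
    print(private_count(database, lambda r: r["carrier"], epsilon=1.0))
```

Because the released count is noisy, an analyst who sees it cannot tell with confidence whether any one person's record was included, yet averages over many people remain useful.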
Even though this particular data gathering method protects users in theory, it would eventually be up to the user to participate when the algorithm is introduced with macOS Sierra. The technique, which is a well-established mathematical process employed by surveyors and statisticians, is expected to make Apple's text, emoji and deep link suggestions better. In addition to the clarification about the privacy technique, the company also said that images from a user's cloud storage are off-limits and are not being used to feed and improve image recognition algorithms. While Apple's precise image-studying practices and other AI training methods are not known, the company is clearly stepping up its predictive algorithm game.