Welcome (back) to Parity!


This article is reposted from the Parity Substack. Follow us there for updates. Welcome back to Parity, now coming from the brand new team. We took over for Dr. Rumman Chowdhury while she's off solving some of the world's most pressing (and challenging) algorithmic issues at Twitter. Rumman's still with us as our lead investor and board member, helping us immensely as we grow.

RStudio AI Blog: Starting to think about AI Fairness


The topic of AI fairness metrics is as important to society as it is confusing. It is confusing for a number of reasons: terminological proliferation, an abundance of formulae, and, last but not least, the impression that everyone else seems to know what they're talking about. This text hopes to counteract some of that confusion by starting from a common-sense contrast between two basic positions: on the one hand, the assumption that dataset features may be taken as reflecting the underlying concepts ML practitioners are interested in; on the other, that there is inevitably a gap between concept and measurement, a gap that may be bigger or smaller depending on what is being measured. In contrasting these fundamental views, we bring together concepts from ML, legal science, and political philosophy.

Enforcing fairness in private federated learning via the modified method of differential multipliers Machine Learning

Federated learning with differential privacy, or private federated learning, provides a strategy to train machine learning models while respecting users' privacy. However, differential privacy can disproportionately degrade the performance of the models on under-represented groups, as these parts of the distribution are difficult to learn in the presence of noise. Existing approaches for enforcing fairness in machine learning models have considered the centralized setting, in which the algorithm has access to the users' data. This paper introduces an algorithm to enforce group fairness in private federated learning, where users' data does not leave their devices. First, the paper extends the modified method of differential multipliers to empirical risk minimization with fairness constraints, thus providing an algorithm to enforce fairness in the central setting. Then, this algorithm is extended to the private federated learning setting. The proposed algorithm, FPFL, is tested on a federated version of the Adult dataset and an "unfair" version of the FEMNIST dataset. The experiments on these datasets show how private federated learning accentuates unfairness in the trained models, and how FPFL is able to mitigate such unfairness.
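The core mechanism named in the abstract, the modified method of differential multipliers, turns a constrained minimization into coupled gradient updates: descend on the parameters, ascend on a Lagrange multiplier, with an extra damping term to stabilize the oscillation. The following is a minimal sketch on a two-variable toy problem standing in for fairness-constrained risk minimization; it is not the paper's FPFL algorithm, and the objective, constraint, and hyperparameters are illustrative choices.

```python
import numpy as np

# Toy stand-in for fairness-constrained empirical risk minimization:
#   minimize  f(w) = (w0 - 3)^2 + (w1 - 1)^2   (the "risk")
#   subject to g(w) = w0 - w1 = 0              (the "fairness constraint")
# Unconstrained optimum is (3, 1); the constrained optimum is (2, 2).
def grad_f(w):
    return 2.0 * (w - np.array([3.0, 1.0]))

def g(w):
    return w[0] - w[1]

grad_g = np.array([1.0, -1.0])

w, lam = np.zeros(2), 0.0   # parameters and Lagrange multiplier
eta, c = 0.05, 5.0          # step size and damping coefficient
for _ in range(2000):
    # Modified method of differential multipliers: gradient descent on w,
    # gradient ascent on lambda, plus the damping term c * g(w) * grad_g
    # that distinguishes it from the basic differential multiplier method.
    w = w - eta * (grad_f(w) + (lam + c * g(w)) * grad_g)
    lam = lam + eta * g(w)
# w has now converged close to the constrained optimum (2, 2)
```

In FPFL the same idea is applied with clipped, noised gradients aggregated across devices, so that the constraint is enforced without the server ever seeing raw user data.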

Neon Genesis


When it comes to generating 3D computer graphics, there's no shortage of software options available. How you decide which software to use is generally priority calculus -- creating meshes for an industrial use case? You may want CAD-specific software like AutoCAD. But what if you want to do everything? And what if you'd prefer to do everything under an open source license (aka free)?

The Price of Diversity Machine Learning

Systemic bias with respect to gender, race and ethnicity, often unconscious, is prevalent in datasets involving choices among individuals. Consequently, society has found it challenging to alleviate bias and achieve diversity in a way that maintains meritocracy in such settings. We propose (a) a novel optimization approach based on optimally flipping outcome labels and training classification models simultaneously to discover changes to be made in the selection process so as to achieve diversity without significantly affecting meritocracy, and (b) a novel implementation tool employing optimal classification trees to provide insights on which attributes of individuals lead to flipping of their labels, and to help make changes in the current selection processes in a manner understandable by human decision makers. We present case studies on three real-world datasets consisting of parole, admissions to the bar and lending decisions, and demonstrate that the price of diversity is low and sometimes negative; that is, we can modify our selection processes in a way that enhances diversity without significantly affecting meritocracy, and sometimes even improves it.
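To make the label-flipping idea concrete, here is a deliberately simplified sketch: flip the fewest 0-to-1 labels, chosen among the highest-scoring rejected candidates of the under-selected group, until selection rates match. This greedy heuristic is an illustration only; the paper formulates flipping jointly with classifier training as an optimization problem, which this sketch does not attempt.

```python
import numpy as np

def flip_for_parity(scores, labels, group):
    """Flip the fewest 0->1 labels, on the highest-scoring rejected
    candidates of the under-selected group, so that both groups reach
    (at least) the same selection rate. Returns a copy of the labels."""
    y = labels.copy()
    rates = [y[group == a].mean() for a in (0, 1)]
    lo = int(rates[1] < rates[0])            # the under-selected group
    hi_rate = max(rates)
    idx = np.where((group == lo) & (y == 0))[0]
    idx = idx[np.argsort(-scores[idx])]      # strongest rejected candidates first
    n_lo = (group == lo).sum()
    need = int(np.ceil(hi_rate * n_lo - y[group == lo].sum()))
    y[idx[:need]] = 1
    return y
```

Because the flips land on the strongest rejected candidates, the change to merit-based ordering is as small as this simple scheme allows, which is the intuition behind the "low price of diversity" finding.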

Non-Comparative Fairness for Human-Auditing and Its Relation to Traditional Fairness Notions Artificial Intelligence

Bias evaluation in machine-learning based services (MLS) based on traditional algorithmic fairness notions that rely on comparative principles is practically difficult, making it necessary to rely on human auditor feedback. However, despite rigorous training on various comparative fairness notions, human auditors are known to disagree on various aspects of fairness notions in practice, making it difficult to collect reliable feedback. This paper offers a paradigm shift in the domain of algorithmic fairness by proposing a new fairness notion based on the principle of non-comparative justice. In contrast to traditional fairness notions, where the outcomes of two individuals/groups are compared, our proposed notion compares the MLS's outcome with a desired outcome for each input. This desired outcome naturally describes a human auditor's expectation, and can be easily used to evaluate MLS on crowd-auditing platforms. We show that any MLS can be deemed fair from the perspective of comparative fairness (be it in terms of individual fairness, statistical parity, equal opportunity or calibration) if it is non-comparatively fair with respect to a fair auditor. We also show that the converse holds true in the context of individual fairness. Given that such an evaluation relies on the trustworthiness of the auditor, we also present an approach to identify fair and reliable auditors by estimating their biases with respect to a given set of sensitive attributes, as well as quantify the uncertainty in the estimation of biases within a given MLS. Furthermore, all of the above results are also validated on the COMPAS, German credit and Adult Census Income datasets.

In recent years, the rapid advancements in the fields of artificial intelligence (AI) and machine learning (ML) have resulted in the proliferation of algorithmic decision making in many practical applications. Examples include decision-support systems that help judges decide whether or not to release a prisoner on parole [1], automated financial decisions in banks regarding granting or denying loans [2], and product recommendations by e-commerce websites [3].
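The per-input comparison described above lends itself to a very simple audit statistic: the fraction of inputs on which the service's outcome deviates from the auditor's desired outcome. A minimal sketch, with the function names and the tolerance `eps` being illustrative assumptions rather than the paper's notation:

```python
import numpy as np

def non_comparative_unfairness(model_out, desired_out):
    """Fraction of inputs where the service's outcome deviates from the
    auditor's desired outcome; 0.0 means perfectly non-comparatively fair."""
    model_out = np.asarray(model_out)
    desired_out = np.asarray(desired_out)
    return float(np.mean(model_out != desired_out))

def is_fair(model_out, desired_out, eps=0.05):
    """Deem the service fair if its deviation from the auditor's desired
    outcomes stays within a small tolerance eps."""
    return non_comparative_unfairness(model_out, desired_out) <= eps
```

Note that no second individual or group appears anywhere: each input is judged only against its own desired outcome, which is what makes the notion non-comparative and easy to crowdsource.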

An Empirical Investigation into Deep and Shallow Rule Learning Artificial Intelligence

Inductive rule learning is arguably among the most traditional paradigms in machine learning. Although we have seen considerable progress over the years in learning rule-based theories, all state-of-the-art learners still learn descriptions that directly relate the input features to the target concept. In the simplest case, concept learning, this is a disjunctive normal form (DNF) description of the positive class. While it is clear that this is sufficient from a logical point of view because every logical expression can be reduced to an equivalent DNF expression, it could nevertheless be the case that more structured representations, which form deep theories by forming intermediate concepts, could be easier to learn, in very much the same way as deep neural networks are able to outperform shallow networks, even though the latter are also universal function approximators. In this paper, we empirically compare deep and shallow rule learning with a uniform general algorithm, which relies on greedy mini-batch based optimization. Our experiments on both artificial and real-world benchmark data indicate that deep rule networks outperform shallow networks.
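The shallow-versus-deep distinction can be made concrete with a few lines of code: a shallow theory is a DNF directly over the input features, while a deep theory first forms intermediate concepts and then a DNF over those. A small sketch with positive literals only (the rule sets and concept layer here are illustrative, not the paper's learned networks):

```python
from itertools import product

# Shallow DNF theory over the inputs: (x0 AND x1) OR (x2 AND x3)
shallow_rules = [(0, 1), (2, 3)]      # each tuple is one conjunction

def eval_dnf(x, rules):
    """A DNF fires if any of its conjunctions is fully satisfied."""
    return any(all(x[i] for i in conj) for conj in rules)

# The same concept as a two-layer ("deep") rule network: first form
# intermediate concepts, then evaluate a DNF over those concepts.
concepts = [(0, 1), (2, 3)]           # hidden-layer conjunctions
output_rules = [(0,), (1,)]           # output layer: c0 OR c1

def eval_deep(x, concepts, output_rules):
    hidden = [all(x[i] for i in conj) for conj in concepts]
    return eval_dnf(hidden, output_rules)

# Both representations agree on every boolean input, illustrating the
# logical equivalence the abstract mentions.
assert all(
    eval_dnf(x, shallow_rules) == eval_deep(x, concepts, output_rules)
    for x in product([0, 1], repeat=4)
)
```

The paper's question is not expressiveness (any deep theory can be flattened to a DNF) but learnability: whether the intermediate concepts make good theories easier to find, analogous to depth in neural networks.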

Costs and Benefits of Wasserstein Fair Regression Machine Learning

Real-world applications of machine learning tools in high-stakes domains are often regulated to be fair, in the sense that the predicted target should satisfy some quantitative notion of parity with respect to a protected attribute. However, the exact tradeoff between fairness and accuracy with a real-valued target is not clear. In this paper, we characterize the inherent tradeoff between statistical parity and accuracy in the regression setting by providing a lower bound on the error of any fair regressor. Our lower bound is sharp, algorithm-independent, and admits a simple interpretation: when the moments of the target differ between groups, any fair algorithm has to make a large error on at least one of the groups. We further extend this result to give a lower bound on the joint error of any (approximately) fair algorithm, using the Wasserstein distance to measure the quality of the approximation. On the upside, we establish the first connection between individual fairness, accuracy parity, and the Wasserstein distance by showing that if a regressor is individually fair, it also approximately verifies the accuracy parity, where the gap is given by the Wasserstein distance between the two groups. Inspired by our theoretical results, we develop a practical algorithm for fair regression through the lens of representation learning, and conduct experiments on a real-world dataset to corroborate our findings.
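The quantity at the heart of the bound is the Wasserstein distance between the two groups' target distributions. In one dimension, with equal-size samples, the empirical Wasserstein-1 distance reduces to the mean absolute difference of the sorted values. A minimal sketch (the sample data is invented for illustration, and the comment states only the informal flavor of the lower bound, not its exact form):

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical 1-D Wasserstein-1 distance between two equal-size
    samples: the mean absolute difference of the sorted values."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    return float(np.mean(np.abs(a - b)))

# Regression targets for two demographic groups whose distributions differ:
y_group0 = np.array([1.0, 2.0, 3.0, 4.0])
y_group1 = np.array([3.0, 4.0, 5.0, 6.0])

# The distributions are shifted by 2, so the distance is 2.0. Informally,
# the larger this gap, the larger the error any statistical-parity-fair
# regressor must incur on at least one of the groups.
w1 = wasserstein_1d(y_group0, y_group1)
```

For unequal sample sizes or weighted samples, `scipy.stats.wasserstein_distance` computes the same quantity from the empirical CDFs.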

The zoo of Fairness metrics in Machine Learning Machine Learning

In recent years, the problem of addressing fairness in Machine Learning (ML) and automatic decision-making has attracted a lot of attention in the scientific communities dealing with Artificial Intelligence. A plethora of different definitions of fairness in ML have been proposed, each capturing a different notion of what a "fair decision" is in situations impacting individuals in the population. The precise differences, implications and "orthogonality" between these notions have not yet been fully analyzed in the literature. In this work, we try to make some order out of this zoo of definitions.
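Two of the most commonly contrasted inhabitants of this zoo, demographic parity and equal opportunity, can be computed in a few lines. A minimal sketch with a binary sensitive attribute (the function names are illustrative; libraries such as fairlearn ship production versions of these metrics):

```python
import numpy as np

def demographic_parity_diff(y_pred, group):
    """|P(Yhat=1 | A=0) - P(Yhat=1 | A=1)|: gap in positive-prediction
    rates between the two groups, ignoring the true labels."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equal_opportunity_diff(y_true, y_pred, group):
    """|TPR_0 - TPR_1|: gap in true-positive rates between the groups,
    i.e. parity restricted to the truly positive individuals."""
    tpr = [y_pred[(group == a) & (y_true == 1)].mean() for a in (0, 1)]
    return abs(tpr[0] - tpr[1])
```

That the two functions condition on different events (the whole population versus the positives only) is exactly the kind of "orthogonality" between definitions the abstract refers to: a classifier can satisfy one notion while violating the other.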

Gender gap narrows in tech but COVID-19 is a major setback for parity


The World Economic Forum's (WEF) annual Global Gender Gap report warns that the COVID-19 pandemic has added several decades to the time before parity is reached: instead of 99.5 years, it will now take 135.6 years. The tech sector, however, has shown some gains, though more can be done, according to top executives at WEF. Moving on from a white male work culture is a requirement for delivering real business value. The report blames the economic impact of lockdowns around the world, which have hit sectors that employ many women, as well as the extra care needed by family members, which disproportionately falls on women to provide. Tech companies have been publishing annual reports on the diversity of their workforces, and although progress has been made, it is not enough. Sheila Warren, head of blockchain and data policy at WEF, rebuts a common myth that diversity in tech is hampered by a lack of qualified candidates.