Goto

Collaborating Authors

 fairness loss



Fairness-Constrained Optimization Attack in Federated Learning

arXiv.org Artificial Intelligence

Federated learning (FL) is a privacy-preserving machine learning technique that facilitates collaboration among participants across demographics. FL enables model sharing, while restricting the movement of data. Since FL provides participants with independence over their training data, it becomes susceptible to poisoning attacks. Such collaboration also propagates bias among the participants, even unintentionally, due to different data distribution or historical bias present in the data. This paper proposes an intentional fairness attack, where a client maliciously sends a biased model, by increasing the fairness loss while training, even considering homogeneous data distribution. The fairness loss is calculated by solving an optimization problem for fairness metrics such as demographic parity and equalized odds. The attack is insidious and hard to detect, as it maintains global accuracy even after increasing the bias. We evaluate our attack against the state-of-the-art Byzantine-robust and fairness-aware aggregation schemes over different datasets, in various settings. The empirical results demonstrate the attack efficacy by increasing the bias up to 90\%, even in the presence of a single malicious client in the FL system.


Metric-Free Individual Fairness in Online Learning

Neural Information Processing Systems

Our results resolve an open question by Gillen et al. (2018) by showing that online learning under an unknown individual fairness constraint is possible even without assuming a strong parametric form of the


Metric-Free Individual Fairness in Online Learning

Neural Information Processing Systems

Our results resolve an open question by Gillen et al. (2018) by showing that online learning under an unknown individual fairness constraint is possible even without assuming a strong parametric form of the


Fair Sortition Made Transparent

Neural Information Processing Systems

Sortition is an age-old democratic paradigm, widely manifested today through the random selection of citizens' assemblies. Recently-deployed algorithms select assemblies maximally fairly, meaning that subject to demographic quotas, they give all potential participants as equal a chance as possible of being chosen. While these fairness gains can bolster the legitimacy of citizens' assemblies and facilitate their uptake, existing algorithms remain limited by their lack of transparency. To overcome this hurdle, in this work we focus on panel selection by uniform lottery, which is easy to realize in an observable way. By this approach, the final assembly is selected by uniformly sampling some pre-selected set of m possible assemblies. We provide theoretical guarantees on the fairness attainable via this type of uniform lottery, as compared to the existing maximally fair but opaque algorithms, for two different fairness objectives. We complement these results with experiments on real-world instances that demonstrate the viability of the uniform lottery approach as a method of selecting assemblies both fairly and transparently.


Privacy-Preserving Fair Synthetic Tabular Data

arXiv.org Artificial Intelligence

Sharing of tabular data containing valuable but private information is limited due to legal and ethical issues. Synthetic data could be an alternative solution to this sharing problem, as it is artificially generated by machine learning algorithms and tries to capture the underlying data distribution. However, machine learning models are not free from memorization and may introduce biases, as they rely on training data. Producing synthetic data that preserves privacy and fairness while maintaining utility close to the real data is a challenging task. This research simultaneously addresses both the privacy and fairness aspects of synthetic data, an area not explored by other studies. In this work, we present PF-WGAN, a privacy-preserving, fair synthetic tabular data generator based on the WGAN-GP model. We have modified the original WGAN-GP by adding privacy and fairness constraints forcing it to produce privacy-preserving fair data. This approach will enable the publication of datasets that protect individual's privacy and remain unbiased toward any particular group. We compared the results with three state-of-the-art synthetic data generator models in terms of utility, privacy, and fairness across four different datasets. We found that the proposed model exhibits a more balanced trade-off among utility, privacy, and fairness.


Exploring Fairness in Educational Data Mining in the Context of the Right to be Forgotten

arXiv.org Artificial Intelligence

Student data, which is a critical component in EDM research, can contain personal information, such as age and gender, as well as academic performance and activity data from online learning systems [24]. By offering valuable insights into student learning, EDM supports the development of more effective educational practices and policies, ultimately improving student outcomes. One of the most popular techniques in the previous works is incorporating machine learning techniques, which has achieved remarkable success in discovering intricate structures within educational datasets. However, in recent years, concerns about the fairness of deploying algorithmic decision-making in the educational context have emerged [2, 22, 27, 49]. Particularly, machine learning models can produce biased and unfair outcomes for certain student groups, significantly affecting their educational opportunities and achievements. Given that the data empowering EDM research can often contain personally identifiable and other sensitive information, there has been increased attention to privacy protection in recent years [37, 43]. Additionally, privacy legislation such as the California Consumer Privacy Act [39] and the former Right to be Forgotten [17] has granted users the right to erase the impact of their sensitive information from the trained models to protect their privacy. One approach to protecting users' privacy involves enabling the trained machine learning model to entirely forget Both authors contributed equally to this research.


Adaptive Boosting with Fairness-aware Reweighting Technique for Fair Classification

arXiv.org Artificial Intelligence

Machine learning methods based on AdaBoost have been widely applied to various classification problems across many mission-critical applications including healthcare, law and finance. However, there is a growing concern about the unfairness and discrimination of data-driven classification models, which is inevitable for classical algorithms including AdaBoost. In order to achieve fair classification, a novel fair AdaBoost (FAB) approach is proposed that is an interpretable fairness-improving variant of AdaBoost. We mainly investigate binary classification problems and focus on the fairness of three different indicators (i.e., accuracy, false positive rate and false negative rate). By utilizing a fairness-aware reweighting technique for base classifiers, the proposed FAB approach can achieve fair classification while maintaining the advantage of AdaBoost with negligible sacrifice of predictive performance. In addition, a hyperparameter is introduced in FAB to show preferences for the fairness-accuracy trade-off. An upper bound for the target loss function that quantifies error rate and unfairness is theoretically derived for FAB, which provides a strict theoretical support for the fairness-improving methods designed for AdaBoost. The effectiveness of the proposed method is demonstrated on three real-world datasets (i.e., Adult, COMPAS and HSLS) with respect to the three fairness indicators. The results are accordant with theoretic analyses, and show that (i) FAB significantly improves classification fairness at a small cost of accuracy compared with AdaBoost; and (ii) FAB outperforms state-of-the-art fair classification methods including equalized odds method, exponentiated gradient method, and disparate mistreatment method in terms of the fairness-accuracy trade-off.


Generalizing Fairness using Multi-Task Learning without Demographic Information

arXiv.org Artificial Intelligence

To ensure the fairness of machine learning systems, we can include a fairness loss during training based on demographic information associated with the training data. However, we cannot train debiased classifiers for most tasks since the relevant datasets lack demographic annotations. Can we utilize demographic data for a related task to improve the fairness of our target task? We demonstrate that demographic fairness objectives transfer to new tasks trained within a multi-task framework. We adapt a single-task fairness loss to a multi-task setting to exploit demographic labels from a related task in debiasing a target task. We explore different settings with missing demographic data and show how our loss can improve fairness even without in-task demographics, across various domains and tasks.


Improving Fairness in Image Classification via Sketching

arXiv.org Artificial Intelligence

Fairness is a fundamental requirement for trustworthy and human-centered Artificial Intelligence (AI) system. However, deep neural networks (DNNs) tend to make unfair predictions when the training data are collected from different sub-populations with different attributes (i.e. color, sex, age), leading to biased DNN predictions. We notice that such a troubling phenomenon is often caused by data itself, which means that bias information is encoded to the DNN along with the useful information (i.e. class information, semantic information). Therefore, we propose to use sketching to handle this phenomenon. Without losing the utility of data, we explore the image-to-sketching methods that can maintain useful semantic information for the target classification while filtering out the useless bias information. In addition, we design a fair loss to further improve the model fairness. We evaluate our method through extensive experiments on both general scene dataset and medical scene dataset. Our results show that the desired image-to-sketching method improves model fairness and achieves satisfactory results among state-of-the-art.