Synthetic Prefixes to Mitigate Bias in Real-Time Neural Query Autocomplete
Rajan, Adithya, Liu, Xiaoyu, Verma, Prateek, Arora, Vibhu
We introduce a data-centric approach for mitigating presentation bias in real-time neural query autocomplete systems through the use of synthetic prefixes. These prefixes are generated from complete user queries collected during regular search sessions where autocomplete was not active, allowing us to enrich the training data for learning-to-rank models with more diverse and less biased examples. This method addresses the inherent bias in engagement signals collected from live query autocomplete interactions, where model suggestions influence user behavior. Our neural ranker is optimized for real-time deployment under strict latency constraints and incorporates a rich set of features, including query popularity, seasonality, fuzzy match scores, and contextual signals such as department affinity, device type, and vertical alignment with previous user queries. To support efficient training, we introduce a task-specific simplification of the listwise loss, reducing computational complexity from $O(n^2)$ to $O(n)$ by leveraging the query autocomplete structure of having only one ground-truth selection per prefix. Deployed in a large-scale e-commerce setting, our system demonstrates statistically significant improvements in user engagement, as measured by mean reciprocal rank and related metrics. Our findings show that synthetic prefixes not only improve generalization but also provide a scalable path toward bias mitigation in other low-latency ranking tasks, including related searches and query recommendations.
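The abstract's complexity claim can be made concrete with a minimal sketch: when each prefix has exactly one ground-truth selection, the listwise softmax cross-entropy loss needs only a single O(n) pass over the candidate scores, rather than O(n^2) pairwise comparisons. The function name and toy scores below are illustrative, not from the paper.

```python
import numpy as np

def listwise_loss_single_positive(scores, positive_idx):
    """Softmax cross-entropy over one candidate list.

    With exactly one ground-truth selection per prefix, the listwise
    loss reduces to -log softmax(scores)[positive_idx], which needs
    only one O(n) pass instead of O(n^2) pairwise comparisons.
    """
    scores = np.asarray(scores, dtype=float)
    shifted = scores - scores.max()          # shift for numerical stability
    log_z = np.log(np.exp(shifted).sum())    # log partition function, O(n)
    return log_z - shifted[positive_idx]

# Candidate completions for one prefix; index 2 was the user's selection.
loss = listwise_loss_single_positive([1.2, 0.3, 2.5, -0.7], positive_idx=2)
```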
Bias Mitigation Agent: Optimizing Source Selection for Fair and Balanced Knowledge Retrieval
Singh, Karanbir, Muppiri, Deepak, Ngu, William
Large Language Models (LLMs) have transformed the field of artificial intelligence by unlocking the era of generative applications. Built on top of generative AI capabilities, Agentic AI represents a major shift toward autonomous, goal-driven systems that can reason, retrieve, and act. However, such systems also inherit the bias present in both internal and external information sources. This significantly affects the fairness and balance of retrieved information, and hence reduces user trust. To address this critical challenge, we introduce a novel Bias Mitigation Agent: a multi-agent system that orchestrates the bias mitigation workflow through specialized agents, optimizing source selection so that retrieved content is both highly relevant and minimally biased, promoting fair and balanced knowledge dissemination. The experimental results demonstrate an 81.82% reduction in bias compared to a baseline naive retrieval strategy.
Whence Is A Model Fair? Fixing Fairness Bugs via Propensity Score Matching
Peng, Kewen, Yang, Yicheng, Zhuo, Hao, Menzies, Tim
Fairness-aware learning aims to mitigate discrimination against specific protected social groups (e.g., those categorized by gender, ethnicity, age) while minimizing predictive performance loss. Despite efforts to improve fairness in machine learning, prior studies have shown that many models remain unfair when measured against various fairness metrics. In this paper, we examine whether the way training and testing data are sampled affects the reliability of reported fairness metrics. Since training and test sets are often randomly sampled from the same population, bias present in the training data may still exist in the test data, potentially skewing fairness assessments. To address this, we propose FairMatch, a post-processing method that applies propensity score matching to evaluate and mitigate bias. FairMatch identifies control and treatment pairs with similar propensity scores in the test set and adjusts decision thresholds for different subgroups accordingly. For samples that cannot be matched, we perform probabilistic calibration using fairness-aware loss functions. Experimental results demonstrate that our approach can (a) precisely locate subsets of the test data where the model is unbiased, and (b) significantly reduce bias on the remaining data. Overall, propensity score matching offers a principled way to improve both fairness evaluation and mitigation, without sacrificing predictive performance.
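The abstract does not give FairMatch's implementation details, but the matching step it describes can be sketched generically: pair treatment and control units whose estimated propensity scores fall within a caliper, leaving the unmatched remainder for separate handling. The greedy strategy and caliper value below are my assumptions, not the paper's.

```python
def match_by_propensity(p_treat, p_ctrl, caliper=0.05):
    """Greedy 1-to-1 matching of treatment units to control units whose
    estimated propensity scores differ by at most `caliper`.

    Returns (treatment_index, control_index) pairs; unmatched treatment
    units are left out (FairMatch would calibrate those separately).
    """
    available = set(range(len(p_ctrl)))
    pairs = []
    for i, p in enumerate(p_treat):
        if not available:
            break
        # Closest still-available control unit by propensity score.
        j = min(available, key=lambda k: abs(p_ctrl[k] - p))
        if abs(p_ctrl[j] - p) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

# Toy propensity scores (e.g., from a logistic-regression model);
# the third treatment unit has no control within the caliper.
pairs = match_by_propensity([0.31, 0.72, 0.90], [0.30, 0.70, 0.10])
```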
Exploring the Implementation of AI in Early Onset Interviews to Help Mitigate Bias
This paper investigates the application of artificial intelligence (AI) in early-stage recruitment interviews in order to reduce inherent bias, specifically sentiment bias. Traditional interviewers are often subject to several biases, including interviewer bias, social desirability effects, and even confirmation bias; in turn, this leads to non-inclusive hiring practices and a less diverse workforce. This study analyzes various AI interventions present in the marketplace today, such as multimodal platforms and interactive candidate assessment tools, in order to gauge the current market usage of AI in early-stage recruitment. It then applies a unique AI system developed to transcribe and analyze interview dynamics, one that emphasizes skill and knowledge over emotional sentiment. Results indicate that AI effectively reduces sentiment-driven biases by 41.2%, suggesting its transformative potential in companies' recruitment processes for improved equity and efficiency.
Trustworthy and Responsible AI for Human-Centric Autonomous Decision-Making Systems
Dehghani, Farzaneh, Dibaji, Mahsa, Anzum, Fahim, Dey, Lily, Basdemir, Alican, Bayat, Sayeh, Boucher, Jean-Christophe, Drew, Steve, Eaton, Sarah Elaine, Frayne, Richard, Ginde, Gouri, Harris, Ashley, Ioannou, Yani, Lebel, Catherine, Lysack, John, Arzuaga, Leslie Salgado, Stanley, Emma, Souza, Roberto, Santos, Ronnie de Souza, Wells, Lana, Williamson, Tyler, Wilms, Matthias, Wahid, Zaman, Ungrin, Mark, Gavrilova, Marina, Bento, Mariana
Artificial Intelligence (AI) represents the frontier of computer science, enabling machines to emulate human intelligence and perform tasks that were once exclusive to human capabilities (Briganti and Le Moine 2020). This rapid progression in AI, driven by Machine Learning (ML) and Deep Learning (DL) innovations, has catalyzed breakthroughs across various industries, including business, communication, healthcare, and education, among others. Utilizing state-of-the-art computational resources, the AI models are trained on extensive datasets and can be used for decision-making on unseen data. Recent advancements in AI algorithms and feature engineering techniques have played a pivotal role in transforming various human-centric fields, notably, healthcare (Esteva et al 2019), image and text generation (Epstein et al 2023), biometrics and cybersecurity (Gavrilova et al 2022), online social media opinion mining (Anzum and Gavrilova 2023), autonomous driving vehicles (Ma et al 2020), and beyond. Despite the impressive capabilities exhibited by recent AI-based systems, a significant challenge lies in their inherent black box nature. Due to the lack of explainability and interpretability of AI models, establishing trust among end users has become critical (von Eschenbach 2021). Therefore, to ensure trustworthiness in AI-empowered systems, it is imperative not only to improve the model's accuracy but also to incorporate explainability and interpretability into the model's architecture and
Mitigating Bias in Dataset Distillation
Cui, Justin, Wang, Ruochen, Xiong, Yuanhao, Hsieh, Cho-Jui
Dataset Distillation has emerged as a technique for compressing large datasets into smaller synthetic counterparts, facilitating downstream training tasks. In this paper, we study the impact of bias inside the original dataset on the performance of dataset distillation. With a comprehensive empirical evaluation on canonical datasets with color, corruption and background biases, we found that color and background biases in the original dataset will be amplified through the distillation process, resulting in a notable decline in the performance of models trained on the distilled dataset, while corruption bias is suppressed through the distillation process. To reduce bias amplification in dataset distillation, we introduce a simple yet highly effective approach based on a sample reweighting scheme utilizing kernel density estimation. Empirical results on multiple real-world and synthetic datasets demonstrate the effectiveness of the proposed method. Notably, on CMNIST with 5% bias-conflict ratio and IPC 50, our method achieves 91.5% test accuracy compared to 23.8% from vanilla DM, boosting the performance by 67.7%, whereas applying state-of-the-art debiasing method on the same dataset only achieves 53.7% accuracy. Our findings highlight the importance of addressing biases in dataset distillation and provide a promising avenue to address bias amplification in the process.
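The reweighting idea in the abstract can be sketched in a few lines: estimate each sample's density with a Gaussian kernel and upweight low-density (bias-conflicting) samples so they are not washed out during distillation. The bandwidth, the inverse-density weight form, and the toy data are assumptions for illustration, not the paper's exact scheme.

```python
import numpy as np

def kde_weights(features, bandwidth=1.0):
    """Assign each sample a weight inversely proportional to its Gaussian
    kernel density estimate, so rare (bias-conflicting) samples count
    more during distillation.
    """
    x = np.asarray(features, dtype=float)
    # Pairwise squared distances between all samples, shape (n, n).
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    density = np.exp(-d2 / (2 * bandwidth ** 2)).mean(axis=1)
    weights = 1.0 / density
    return weights / weights.sum()           # normalize to sum to 1

# One tight pair plus one outlier: the outlier gets the largest weight.
w = kde_weights([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
```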
The Pursuit of Fairness in Artificial Intelligence Models: A Survey
Kheya, Tahsin Alamgir, Bouadjenek, Mohamed Reda, Aryal, Sunil
Artificial Intelligence (AI) models are now being utilized in all facets of our lives such as healthcare, education and employment. Since they are used in numerous sensitive environments and make decisions that can be life altering, potential biased outcomes are a pressing matter. Developers should ensure that such models don't manifest any unexpected discriminatory practices like partiality for certain genders, ethnicities or disabled people. With the ubiquitous dissemination of AI systems, researchers and practitioners are becoming more aware of unfair models and are bound to mitigate bias in them. Significant research has been conducted in addressing such issues to ensure models don't intentionally or unintentionally perpetuate bias. This survey offers a synopsis of the different ways researchers have promoted fairness in AI systems. We explore the different definitions of fairness existing in the current literature. We create a comprehensive taxonomy by categorizing different types of bias and investigate cases of biased AI in different application domains. A thorough study is conducted of the approaches and techniques employed by researchers to mitigate bias in AI models. Moreover, we also delve into the impact of biased models on user experience and the ethical considerations to contemplate when developing and deploying such models. We hope this survey helps researchers and practitioners understand the intricate details of fairness and bias in AI systems. By sharing this thorough survey, we aim to promote additional discourse in the domain of equitable and responsible AI.
CRISPR: Eliminating Bias Neurons from an Instruction-following Language Model
Yang, Nakyeong, Kang, Taegwan, Jung, Kyomin
Large language models (LLMs) executing tasks through instruction-based prompts often face challenges stemming from distribution differences between user instructions and training instructions. This leads to distractions and biases, especially when dealing with inconsistent dynamic labels. In this paper, we introduce a novel bias mitigation method, CRISPR, designed to alleviate instruction-label biases in LLMs. CRISPR uses attribution methods to identify the bias neurons that influence biased outputs and employs pruning to eliminate them. Experimental results demonstrate the method's effectiveness in mitigating biases in instruction-based prompting, enhancing language model performance on social bias benchmarks without compromising pre-existing knowledge. CRISPR is highly practical and model-agnostic, offering flexibility in adapting to evolving social biases.
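The abstract does not specify CRISPR's attribution method, but the identify-and-prune step can be sketched on a toy layer: score each hidden unit by a simple |activation × gradient| attribution toward a biased output and zero out the top-k. The attribution form, the value of k, and the toy arrays are assumptions for illustration.

```python
import numpy as np

def prune_bias_neurons(activations, gradients, k=2):
    """Zero out the k hidden units with the largest mean
    |activation * gradient| attribution toward a biased output.

    Returns a binary mask over neurons (0 = pruned) that would be
    applied to the layer's activations at inference time.
    """
    attribution = np.abs(activations * gradients).mean(axis=0)
    pruned = np.argsort(attribution)[-k:]    # most bias-attributed units
    mask = np.ones(attribution.shape[0])
    mask[pruned] = 0.0
    return mask

# Toy layer with 5 neurons over a batch of 3 identical examples;
# units 1 and 3 carry the largest attribution and get pruned.
acts = np.array([[0.1, 2.0, 0.2, 3.0, 0.1]] * 3)
grads = np.array([[0.1, 1.0, 0.1, 1.0, 0.1]] * 3)
mask = prune_bias_neurons(acts, grads, k=2)
```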
MLOps: A Primer for Policymakers on a New Frontier in Machine Learning
Jazmia Henry July 18, 2022 Summary Discussions about reducing the bias present in algorithms have been on the rise since the mid-2010s. AI ethicists, DEI practitioners, sociologists, data scientists, and social justice advocates have decried the lack of understanding of the harms that algorithms pose to people who belong to historically marginalized groups. These cries have become increasingly accepted in industry since 2020, but little is understood of how algorithm and Machine Learning (ML) model builders should go about mitigating bias in models that are intended for deployment. This chapter is written with the Data Scientist or MLOps professional in mind but can be used as a resource for policymakers, reformists, AI ethicists, sociologists, and others interested in finding methods that help reduce bias in algorithms. I take a deployment-centered approach with the assumption that the professionals reading this work have already read the excellent work on the implications of algorithms for historically marginalized groups by Gebru, Buolamwini, Benjamin, and Shane, to name a few. If you have not read those works, I refer you to the "Important Reading for Ethical Model Building" list at the end of this paper, as it will help give you a framework for thinking about Machine Learning models more holistically, taking into account their effect on marginalized people. In the Introduction to this chapter, I root the significance of their work in real-world examples of what happens when models are deployed without transparent data collected for the training process, and without practitioners paying special attention to models that adapt to exploit gaps between their training environment and the real world.
The rest of this chapter builds on the work of the aforementioned researchers, discusses the reality of how models perform post-production, and details ways ML practitioners can use tools during the MLOps lifecycle to identify and mitigate bias that may be introduced to models in the real world. Introduction "Whether AI will help us reach our aspirations or reinforce the unjust inequalities is ultimately up to us." - Joy Buolamwini, 'Facing the Coded Gaze' AI: More than Human Whether you're driving your car using a GPS system, calling on Alexa or Siri to turn on your favorite tune, going on social media to perform a well-earned scroll down memory lane, or searching Google to find a gift for a friend, you have encountered a Machine Learning model.
Towards Algorithmic Fairness in Space-Time: Filling in Black Holes
Flynn, Cheryl, Guha, Aritra, Majumdar, Subhabrata, Srivastava, Divesh, Zhou, Zhengyi
New technologies and the availability of geospatial data have drawn attention to spatio-temporal biases present in society. For example: the COVID-19 pandemic highlighted disparities in the availability of broadband service and its role in the digital divide; the environmental justice movement in the United States has raised awareness to health implications for minority populations stemming from historical redlining practices; and studies have found varying quality and coverage in the collection and sharing of open-source geospatial data. Despite the extensive literature on machine learning (ML) fairness, few algorithmic strategies have been proposed to mitigate such biases. In this paper we highlight the unique challenges for quantifying and addressing spatio-temporal biases, through the lens of use cases presented in the scientific literature and media. We envision a roadmap of ML strategies that need to be developed or adapted to quantify and overcome these challenges -- including transfer learning, active learning, and reinforcement learning techniques. Further, we discuss the potential role of ML in providing guidance to policy makers on issues related to spatial fairness.