fairword
Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models
Pre-training large language models (LLMs) on vast text corpora enhances natural language processing capabilities but risks encoding social biases, particularly gender bias. While parameter-modification methods like fine-tuning mitigate bias, they are resource-intensive, unsuitable for closed-source models, and hard to adapt to evolving societal norms. Instruction-based approaches offer flexibility but often degrade performance on general tasks. To address these limitations, we propose FaIRMaker, an automated and model-independent framework that employs an auto-search and refinement paradigm to adaptively generate Fairwords, which act as instructions to reduce gender bias and enhance response quality. FaIRMaker strengthens debiasing by enlarging the Fairwords search space. It preserves utility and remains applicable to closed-source models by training a sequence-to-sequence model that adaptively refines Fairwords into effective debiasing instructions for gender-related queries and into performance-boosting prompts for neutral inputs. Extensive experiments demonstrate that FaIRMaker effectively mitigates gender bias while preserving task integrity and remaining compatible with both open- and closed-source LLMs.
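The inference-time flow described in the abstract (a refiner turns a searched Fairword into an instruction, which is then attached to the user query) can be pictured with a minimal sketch. Everything here is an assumption for illustration: `toy_refiner`, `build_prompt`, the keyword list, and the instruction strings are hypothetical, and the keyword rule only stands in for the trained sequence-to-sequence model, whose actual behavior the abstract does not specify.

```python
from typing import Callable

# Hypothetical keyword list; a stand-in for whatever signal the learned refiner uses.
GENDER_TERMS = {"he", "she", "him", "her", "man", "woman", "men", "women", "male", "female"}


def toy_refiner(fairword: str, query: str) -> str:
    """Placeholder for the trained sequence-to-sequence refiner.

    A real refiner would condition on the searched Fairword; this toy
    version ignores its content and only inspects the query.
    """
    tokens = {t.strip(".,?!").lower() for t in query.split()}
    if tokens & GENDER_TERMS:
        # Gender-related query: emit a debiasing instruction.
        return "Answer without relying on gender stereotypes."
    # Neutral query: emit a generic quality-oriented prompt.
    return "Answer accurately and helpfully."


def build_prompt(fairword: str, query: str,
                 refiner: Callable[[str, str], str] = toy_refiner) -> str:
    """Prepend the refined instruction to the query before calling any LLM."""
    instruction = refiner(fairword, query)
    return f"{instruction}\n\n{query}"


if __name__ == "__main__":
    print(build_prompt("<searched fairword>", "Is she a good fit for an engineering job?"))
    print(build_prompt("<searched fairword>", "What is the capital of France?"))
```

Because the sketch only edits the prompt text, it would work the same way with an open-source model or an API-only one, which is the property the abstract emphasizes.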
Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models
Xu, Yue, Fu, Chengyan, Xiong, Li, Yang, Sibei, Wang, Wenjie
Pre-training large language models (LLMs) on vast text corpora enhances natural language processing capabilities but risks encoding social biases, particularly gender bias. While parameter-modification methods like fine-tuning mitigate bias, they are resource-intensive, unsuitable for closed-source models, and hard to adapt to evolving societal norms. Instruction-based approaches offer flexibility but often compromise task performance. To address these limitations, we propose $\textit{FaIRMaker}$, an automated and model-independent framework that employs an $\textbf{auto-search and refinement}$ paradigm to adaptively generate Fairwords, which act as instructions integrated into input queries to reduce gender bias and enhance response quality. Extensive experiments demonstrate that $\textit{FaIRMaker}$ automatically searches for and dynamically refines Fairwords, effectively mitigating gender bias while preserving task integrity and ensuring compatibility with both API-based and open-source LLMs.
Fairwords claims to prevent workplace harassment with AI, but the reality is more complicated
Harassment in the workplace affects employees of all backgrounds, genders, sexualities, and ethnicities -- but disproportionately those in under-represented groups. A 2018 survey by Stop Street Harassment showed that 81% of women have been harassed in their lifetime. And according to a UCLA School of Law study, half of LGBTQ workers have faced job discrimination at some point in their careers. Work-from-home arrangements during the pandemic haven't slowed or reversed the trend -- in fact, they've accelerated it.