Group Fairness Meets the Black Box: Enabling Fair Algorithms on Closed LLMs via Post-Processing
Xian, Ruicheng, Wan, Yuxuan, Zhao, Han
arXiv.org Artificial Intelligence
Instruction fine-tuned large language models (LLMs) enable a simple zero-shot or few-shot prompting paradigm, also known as in-context learning, for building prediction models. This convenience, combined with continued advances in LLM capability, has the potential to drive their adoption across a broad range of domains, including high-stakes applications where group fairness (preventing disparate impacts across demographic groups) is essential. Most existing approaches to enforcing group fairness on LLM-based classifiers rely on traditional fair algorithms applied via model fine-tuning or head-tuning on final-layer embeddings. These approaches are not applicable in the in-context learning setting to closed-weight LLMs, which include some of the most capable commercial models today, such as GPT-4, Gemini, and Claude. In this paper, we propose a framework for deriving fair classifiers from closed-weight LLMs via prompting. The LLM is treated as a feature extractor: features are elicited from its probabilistic predictions (e.g., token log probabilities) using prompts strategically designed for the specified fairness criterion, so as to obtain sufficient statistics for fair classification. A fair algorithm is then applied to these features to train a lightweight fair classifier in a post-hoc manner. Experiments on five datasets, including three tabular ones, demonstrate strong accuracy-fairness tradeoffs for the classifiers derived by our framework from both open-weight and closed-weight LLMs. In particular, our framework is data-efficient and outperforms fair classifiers trained on LLM embeddings (i.e., head-tuning) or from scratch on raw tabular features.
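The general shape of such a post-processing pipeline can be illustrated with a minimal sketch. It is not the paper's algorithm: it swaps in a much simpler technique, per-group thresholds targeting statistical parity, and it replaces the LLM with synthetic scores. The helper names (`label_probability`, `parity_thresholds`, `predict`, `positive_rate`), the two-label "yes"/"no" prompt format, and the Beta-distributed scores are all assumptions made for illustration.

```python
import math
import random


def label_probability(logp_yes, logp_no):
    """Turn the log probabilities of two label tokens into P(label = yes).

    Assumes (hypothetically) that the prompt asks for a "yes"/"no" answer and
    that the API exposes a log probability for each of the two tokens.
    """
    p_yes, p_no = math.exp(logp_yes), math.exp(logp_no)
    return p_yes / (p_yes + p_no)


def parity_thresholds(scores, groups, target_rate):
    """Pick one threshold per group so that roughly target_rate of each
    group is predicted positive (a simple statistical-parity post-processor)."""
    thresholds = {}
    for g in set(groups):
        ranked = sorted((s for s, gg in zip(scores, groups) if gg == g), reverse=True)
        k = round(target_rate * len(ranked))
        # The k-th highest score becomes the cutoff; no positives if k == 0.
        thresholds[g] = ranked[k - 1] if k > 0 else float("inf")
    return thresholds


def predict(scores, groups, thresholds):
    """Binary predictions using each example's group-specific threshold."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


def positive_rate(preds, groups, g):
    """Fraction of group g that is predicted positive."""
    selected = [p for p, gg in zip(preds, groups) if gg == g]
    return sum(selected) / len(selected)


# Synthetic stand-in for LLM-elicited scores: group "b" gets higher scores.
rng = random.Random(0)
groups = ["a"] * 500 + ["b"] * 500
scores = [rng.betavariate(2, 5) for _ in range(500)] + \
         [rng.betavariate(5, 2) for _ in range(500)]

# Baseline: one shared threshold of 0.5 for everyone.
base = predict(scores, groups, {"a": 0.5, "b": 0.5})
gap_before = abs(positive_rate(base, groups, "a") - positive_rate(base, groups, "b"))

# Post-processed: group-specific thresholds with a common positive rate.
fair = predict(scores, groups, parity_thresholds(scores, groups, target_rate=0.3))
gap_after = abs(positive_rate(fair, groups, "a") - positive_rate(fair, groups, "b"))

print(f"parity gap before: {gap_before:.3f}, after: {gap_after:.3f}")
```

The sketch touches only the model's output scores, never its weights or embeddings, which is the property that makes post-processing usable with closed-weight models. A real application would fit the thresholds on held-out labeled data and weigh the accuracy cost of the chosen positive rate.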
Aug-18-2025
- Country:
  - Europe > France (0.04)
  - North America > United States
    - Alaska (0.04)
    - California (0.04)
    - Illinois > Champaign County
      - Urbana (0.40)
    - New York (0.04)
    - South Carolina (0.04)
- Genre:
  - Research Report (1.00)
- Industry:
  - Government (0.46)
  - Health & Medicine (0.67)
  - Law (0.67)
  - Transportation > Air (0.40)
- Technology: