Metric-Fair Prompting: Treating Similar Samples Similarly
Jing Wang, Jie Shen, Xing Niu, Tong Zhang, Jeremy Weiss
arXiv.org Artificial Intelligence
We introduce \emph{Metric-Fair Prompting}, a fairness-aware prompting framework that guides large language models (LLMs) to make decisions under metric-fairness constraints. In multiple-choice medical question answering, each (question, option) pair is treated as a binary instance with label $+1$ (correct) or $-1$ (incorrect). To promote \emph{individual fairness} -- treating similar instances similarly -- we compute question similarity from NLP embeddings and solve items in \emph{joint pairs of similar questions} rather than in isolation. The prompt enforces a global decision protocol: extract decisive clinical features, map each (question, option) pair to a score $f(x)$ that acts as a confidence, and impose a Lipschitz-style constraint so that similar inputs receive similar scores and hence consistent outputs. Evaluated on the MedQA (US) benchmark, Metric-Fair Prompting improves over standard single-item prompting, demonstrating that fairness-guided, confidence-oriented reasoning can enhance LLM accuracy on high-stakes clinical multiple-choice questions.
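The similarity-pairing step described above can be sketched as follows. This is a minimal illustration only: the toy bag-of-words embedding, the `pair_similar` helper, and the sample questions are hypothetical stand-ins for the paper's actual NLP embeddings and pairing procedure.

```python
import math
from collections import Counter
from itertools import combinations

def embed(text):
    # Toy bag-of-words embedding; a stand-in for the paper's NLP embeddings.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def pair_similar(questions):
    # Greedily group questions into joint pairs, most similar first,
    # so each pair can be solved together under the shared protocol.
    vecs = [embed(q) for q in questions]
    sims = sorted(((cosine(vecs[i], vecs[j]), i, j)
                   for i, j in combinations(range(len(questions)), 2)),
                  reverse=True)
    used, pairs = set(), []
    for _, i, j in sims:
        if i not in used and j not in used:
            used.update((i, j))
            pairs.append((i, j))
    return pairs

questions = [
    "Which drug treats hypertension in pregnancy?",
    "Which antihypertensive is safe in pregnancy?",
    "What is the first-line therapy for type 2 diabetes?",
    "Which agent is first-line for newly diagnosed diabetes?",
]
print(pair_similar(questions))  # pairs the two pregnancy items and the two diabetes items
```

Each resulting pair would then be answered in one joint prompt, with the Lipschitz-style constraint $|f(x) - f(x')| \le L \cdot d(x, x')$ encouraging consistent scores across the paired items.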
Dec-9-2025