fairness function
Towards Reward Fairness in RLHF: From a Resource Allocation Perspective
Ouyang, Sheng, Hu, Yulan, Chen, Ge, Li, Qingyang, Zhang, Fuzheng, Liu, Yong
Rewards serve as proxies for human preferences and play a crucial role in Reinforcement Learning from Human Feedback (RLHF). However, if these rewards are imperfect and exhibit various biases, they can adversely affect the alignment of large language models (LLMs). In this paper, we collectively define the various biases present in rewards as the problem of reward unfairness. We propose a bias-agnostic method that addresses reward fairness from a resource allocation perspective: it is not designed for any specific type of bias, yet mitigates them effectively. Specifically, we model preference learning as a resource allocation problem, treating rewards as resources to be allocated while considering the trade-off between utility and fairness in their distribution. We propose two methods, Fairness Regularization and Fairness Coefficient, to achieve fairness in rewards. We apply our methods in both verification and reinforcement learning scenarios to obtain a fair reward model and a policy model, respectively. Experiments conducted in these scenarios demonstrate that our approach aligns LLMs with human preferences more fairly.
DECAF: Learning to be Fair in Multi-agent Resource Allocation
A wide variety of resource allocation problems operate under resource constraints that are managed by a central arbitrator, with agents who evaluate and communicate preferences over these resources. We formulate this broad class of problems as Distributed Evaluation, Centralized Allocation (DECA) problems and propose methods to learn fair and efficient policies in centralized resource allocation. Our methods are applied to learning long-term fairness in a novel and general framework for fairness in multi-agent systems. We present three methods based on Double Deep Q-Learning: (1) a joint weighted optimization of fairness and utility, (2) a split optimization that learns two separate Q-estimators for utility and fairness, and (3) an online policy perturbation that guides existing black-box utility functions toward fair solutions. Our methods outperform existing fair MARL approaches on multiple resource allocation domains, even when evaluated using diverse fairness functions, and allow for flexible online trade-offs between utility and fairness.
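The idea of a joint weighted trade-off between utility and fairness can be illustrated with a minimal sketch. This is not the DECAF implementation: there is no Q-learning here, and the fairness function (negative variance of cumulative utilities), the weight `beta`, and all function names are assumptions chosen only for illustration.

```python
from statistics import pvariance


def fairness(cumulative):
    """Hypothetical fairness function: negative variance of the
    agents' cumulative utilities (higher means more equal)."""
    return -pvariance(cumulative)


def joint_score(cumulative, gains, beta):
    """Weighted sum of immediate utility and resulting fairness.

    cumulative: utilities each agent has accumulated so far
    gains:      utility each agent would receive under one allocation
    beta:       assumed trade-off weight in [0, 1]; 0 means utility only
    """
    after = [c + g for c, g in zip(cumulative, gains)]
    return (1 - beta) * sum(gains) + beta * fairness(after)


def choose_allocation(cumulative, candidates, beta):
    """Central arbitrator: pick the candidate with the best joint score."""
    return max(candidates, key=lambda gains: joint_score(cumulative, gains, beta))


# Agent 0 is already ahead; the candidates give the resource to one agent each.
cumulative = [5.0, 0.0]
candidates = [(3.0, 0.0), (0.0, 2.0)]
utility_only = choose_allocation(cumulative, candidates, beta=0.0)
balanced = choose_allocation(cumulative, candidates, beta=0.5)
```

With `beta=0.0` the arbitrator favors the higher-utility allocation to agent 0; raising `beta` shifts the choice toward the agent that has received less, and changing `beta` at decision time is one simple way to picture an online trade-off.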
Remembering to Be Fair: On Non-Markovian Fairness in Sequential Decision Making (Preliminary Report)
Alamdari, Parand A., Klassen, Toryn Q., Creager, Elliot, McIlraith, Sheila A.
Fair decision making has largely been studied with respect to a single decision. In this paper we investigate the notion of fairness in the context of sequential decision making where multiple stakeholders can be affected by the outcomes of decisions, and where decision making may be informed by additional constraints and criteria beyond the requirement of fairness. In this setting, we observe that fairness often depends on the history of the sequential decision-making process and not just on the current state. To advance our understanding of this class of fairness problems, we define the notion of non-Markovian fairness in the context of sequential decision making. We identify properties of non-Markovian fairness, including notions of long-term, anytime, periodic, and bounded fairness. We further explore the interplay between non-Markovian fairness and memory, and how this can support construction of fair policies in sequential decision-making settings.
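A small sketch can make the history dependence concrete. The definitions below are assumptions for illustration, not the paper's formal ones: a history is a list of which stakeholder was served at each step, "long-term" fairness is read as balanced counts at the end of the history, and "anytime" fairness as balanced counts after every prefix. The per-stakeholder count vector plays the role of memory.

```python
def counts(history, n_agents):
    """Number of times each stakeholder was served in the history."""
    c = [0] * n_agents
    for agent in history:
        c[agent] += 1
    return c


def is_balanced(history, n_agents, slack=1):
    """Assumed long-term check: final counts differ by at most `slack`."""
    c = counts(history, n_agents)
    return max(c) - min(c) <= slack


def anytime_balanced(history, n_agents, slack=1):
    """Assumed anytime check: every prefix of the history is balanced."""
    return all(
        is_balanced(history[:t], n_agents, slack)
        for t in range(1, len(history) + 1)
    )


def memory_policy(memory):
    """Serve the stakeholder served least so far (ties go to the lowest index)."""
    return min(range(len(memory)), key=lambda i: memory[i])


def run(policy, n_agents, steps):
    """Roll out a policy that sees only the count vector, not the full history."""
    memory = [0] * n_agents
    history = []
    for _ in range(steps):
        agent = policy(memory)
        history.append(agent)
        memory[agent] += 1
    return history


alternating = run(memory_policy, n_agents=2, steps=6)
front_loaded = [0, 0, 0, 1, 1, 1]
```

Under these assumed definitions, `front_loaded` is balanced at the end but not at every step, while the memory-based policy stays balanced throughout; neither property can be judged from the most recent decision alone.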
Fair and Robust Estimation of Heterogeneous Treatment Effects for Policy Learning
Kim, Kwangho, Zubizarreta, José R.
We propose a simple and general framework for nonparametric estimation of heterogeneous treatment effects under fairness constraints. Under standard regularity conditions, we show that the resulting estimators possess the double robustness property. We use this framework to characterize the trade-off between fairness and the maximum welfare achievable by the optimal policy. We evaluate the methods in a simulation study and illustrate them in a real-world case study.