Navigating the Ocean of Biases: Political Bias Attribution in Language Models via Causal Structures
Jenny, David F., Billeter, Yann, Sachan, Mrinmaya, Schölkopf, Bernhard, Jin, Zhijing
arXiv.org Artificial Intelligence
The rapid advancement of Large Language Models (LLMs) has sparked intense debate regarding their ability to perceive and interpret complex socio-political landscapes. In this study, we explore the decision-making processes and inherent biases of LLMs, exemplified by ChatGPT, contextualizing our analysis within political debates. We aim not to critique or validate LLMs' values, but rather to discern how they interpret and adjudicate "good arguments." By applying Activity Dependency Networks (ADNs), we extract the LLMs' implicit criteria for such assessments and illustrate how normative values influence these perceptions. We discuss the consequences of our findings for human-AI alignment and bias mitigation.
Figure 1: (Undesired) Effect of Bias Treatment on Decision Process. The figure depicts how the LLM's perception of value A is considered during the decision process while judging B and C through f(C|A) and f(B|A). When the biased association of value A with C (f(C|A)) is treated by naively fine-tuning the model to align with this value of interest, other value associations (f(B|A)) that are not actively considered may be changed indiscriminately, regardless of whether they were already aligned. These associations are currently neither observable nor predictable, yet changes in them ...
Nov-14-2023
- Country:
- Africa > Middle East
- Libya (0.04)
- Asia
- China (0.04)
- Middle East
- Russia (0.04)
- Europe
- Croatia > Dubrovnik-Neretva County
- Dubrovnik (0.04)
- Germany > Baden-Württemberg
- Tübingen Region > Tübingen (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Middle East (0.05)
- Russia (0.04)
- Switzerland > Zürich
- Zürich (0.04)
- Western Europe (0.04)
- North America
- Canada > Ontario
- Toronto (0.04)
- United States > Washington
- King County > Seattle (0.04)
- Genre:
- Research Report > New Finding (0.86)