Review for NeurIPS paper: Investigating Gender Bias in Language Models Using Causal Mediation Analysis

Jan-26-2025, 14:16:53 GMT – Neural Information Processing Systems

Only the reporting clause is examined, while the that-clause containing the statement is ignored: Previous bias-probing studies feed the model the entire sentence with its complete context. In this paper, however, only the prompt (the reporting clause) is fed to the language model for analysis, so the proposed intervention setup effectively probes bias only at the word level. In the templates shown in Figure 8 in the Appendix, the verbs "cry" and "drive" could themselves embody implicit bias, yet the current framework does not investigate such potential biases. The study's conclusion that gender bias effects are concentrated in specific components of the model therefore may not generalize once more complex syntactic and semantic structures and interactions are considered.
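To make the contrast between the two probing setups concrete, here is a minimal sketch that only enumerates the inputs each setup would feed to a model. The templates, word lists, and function names are hypothetical and chosen for illustration; they are not taken from the paper or its Figure 8, and no model is called.

```python
from itertools import product

# Hypothetical templates, assumed for illustration only.
REPORTING_CLAUSE = "The {occupation} said that"
THAT_CLAUSE = "{pronoun} {verb} all day"


def prompt_only_inputs(occupations):
    """Inputs for prompt-only probing: just the reporting clause."""
    return [REPORTING_CLAUSE.format(occupation=o) for o in occupations]


def full_sentence_inputs(occupations, pronouns, verbs):
    """Inputs for full-sentence probing: reporting clause plus that-clause."""
    return [
        REPORTING_CLAUSE.format(occupation=o)
        + " "
        + THAT_CLAUSE.format(pronoun=p, verb=v)
        for o, p, v in product(occupations, pronouns, verbs)
    ]


if __name__ == "__main__":
    occupations = ["nurse", "doctor"]
    # Prompt-only inputs differ in a single word (the occupation).
    for text in prompt_only_inputs(occupations):
        print(text)
    # Full-sentence inputs also vary the pronoun and the verb.
    for text in full_sentence_inputs(occupations, ["he", "she"], ["cried", "drove"]):
        print(text)
```

In the prompt-only setup the inputs differ in one word, so any measured effect can only be attributed to that word. The full-sentence setup also varies the pronoun and the verb, which is where the interactions raised in this review would show up.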

causal mediation analysis, language model, neurips paper

Industry:
- Law > Alternative Dispute Resolution (0.40)

Technology:
- Information Technology > Artificial Intelligence > Natural Language (1.00)