A Predictive Factor Analysis of Social Biases and Task-Performance in Pretrained Masked Language Models

Zhou, Yi, Camacho-Collados, Jose, Bollegala, Danushka

Oct-22-2023–arXiv.org Artificial Intelligence

Prior work has reported various types of social biases in pretrained Masked Language Models (MLMs). However, an MLM is characterized by many underlying factors, such as its model size, the size of its training data, its training objectives, the domain from which the pretraining data is sampled, its tokenization, and the languages present in the pretraining corpora, to name a few. It remains unclear which of these factors influence the social biases that MLMs learn. To study the relationship between model factors and the social biases learned by an MLM, as well as the model's downstream task performance, we conduct a comprehensive study over 39 pretrained MLMs covering different model sizes, training objectives, tokenization methods, training data domains and languages. Our results shed light on important factors often neglected in prior literature, such as tokenization and training objectives.



Country:
- Asia (0.04)
- North America
  - Dominican Republic (0.04)
  - United States
    - Washington > King County
      - Seattle (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California > San Diego County
      - San Diego (0.04)
  - Canada > British Columbia
    - Metro Vancouver Regional District > Vancouver (0.04)
- Europe
  - Spain (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)

Genre:
- Research Report > New Finding (0.66)

Industry:
- Health & Medicine (0.46)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (1.00)
    - Natural Language
      - Text Processing (0.68)
      - Machine Translation (0.68)
      - Large Language Model (0.46)
    - Machine Learning
      - Statistical Learning (0.69)
      - Neural Networks (0.68)
      - Decision Tree Learning (0.47)
