NBIAS: A Natural Language Processing Framework for Bias Identification in Text

Raza, Shaina, Garg, Muskan, Reji, Deepak John, Bashir, Syed Raza, Ding, Chen

Aug-29-2023–arXiv.org Artificial Intelligence

Bias in textual data can skew interpretations and outcomes wherever that data is used. Such biases can perpetuate stereotypes, discrimination, or other forms of unfair treatment, and an algorithm trained on biased data may make decisions that disproportionately affect a particular group of people. It is therefore crucial to detect and remove these biases to ensure the fair and ethical use of data. To this end, we develop NBIAS, a comprehensive and robust framework that consists of four main layers: data, corpus construction, model development, and evaluation. The dataset is constructed by collecting diverse data from various domains, including social media, healthcare, and job hiring portals. We apply a transformer-based token classification model that identifies biased words and phrases through a unique named entity, BIAS. In the evaluation, we combine quantitative and qualitative measures to gauge the effectiveness of our models. We achieve accuracy improvements ranging from 1% to 8% over the baselines. We also obtain a robust understanding of how the model functions. The proposed approach is applicable to a variety of biases and contributes to the fair and ethical use of textual data.
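To make the token classification idea concrete, here is a minimal sketch of how per-token predictions for a BIAS entity could be collapsed into biased phrases. It assumes a BIO tagging scheme (`B-BIAS`, `I-BIAS`, `O`), which the abstract does not specify. The function name `extract_bias_spans`, the example sentence, and its tags are hypothetical illustrations, not the authors' code or data.

```python
from typing import List, Tuple


def extract_bias_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Collapse BIO tags into (start, end_exclusive, text) spans for the BIAS entity.

    Assumption: tags use the labels B-BIAS, I-BIAS, and O. The BIO scheme is an
    illustrative choice; the abstract only names a single entity type, BIAS.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[int, int, str]] = []
    start = None  # index where the currently open span began, if any

    for i, tag in enumerate(tags):
        if tag == "B-BIAS":
            # A new span begins; close any span that was still open.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag == "I-BIAS":
            # Continue the open span; tolerate an I- tag with no preceding B-.
            if start is None:
                start = i
        else:
            # Any other tag (e.g. O) closes the open span.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
                start = None

    # Close a span that runs to the end of the sentence.
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Hypothetical example sentence with hand-written tags (not model output).
tokens = ["Older", "applicants", "are", "too", "slow", "to", "learn", "new", "tools"]
tags = ["O", "O", "O", "B-BIAS", "I-BIAS", "O", "O", "O", "O"]
spans = extract_bias_spans(tokens, tags)
print(spans)  # [(3, 5, 'too slow')]
```

In a real pipeline the tags would come from the trained model's per-token predictions, after subword pieces are aligned back to words. That alignment step is omitted here.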

annotation, machine learning, natural language

Country:
- South America > Chile
  - Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America
  - Dominican Republic (0.04)
  - United States
    - Minnesota > Olmsted County
      - Rochester (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Spain (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Germany > Hesse
    - Darmstadt Region > Darmstadt (0.04)
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)
  - Croatia > Dubrovnik-Neretva County
    - Dubrovnik (0.04)
- Asia > India
  - Karnataka > Bengaluru (0.04)
- Africa > Eswatini
  - Manzini > Manzini (0.04)

Genre:
- Research Report
  - New Finding (1.00)
  - Promising Solution (0.67)

Industry:
- Government (1.00)
- Media > News (0.67)
- Health & Medicine > Therapeutic Area (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Text Processing (1.00)
  - Machine Learning
    - Performance Analysis > Accuracy (1.00)
    - Neural Networks > Deep Learning (1.00)
