Impact Of Missing Data Imputation On The Fairness And Accuracy Of Graph Node Classifiers

Mansoor, Haris, Ali, Sarwan, Alam, Shafiq, Khan, Muhammad Asad, Hassan, Umair ul, Khan, Imdadullah

Nov-1-2022–arXiv.org Artificial Intelligence

The fairness of machine learning (ML) algorithms has recently attracted considerable research interest. Most ML methods show bias toward protected groups, which limits their applicability in domains such as crime rate prediction. Real-world data often contain missing values which, if not handled appropriately, are known to harm fairness further. Many imputation methods have been proposed to deal with missing data, but the effect of imputation on fairness has not been well studied. In this paper, we analyze the effect on fairness of imputing missing node attributes in graph data using different embedding and neural network methods. Extensive experiments on six datasets demonstrate severe fairness issues in missing data imputation under graph node classification. We also find that the choice of imputation method affects both fairness and accuracy. Our results provide valuable insights into graph data fairness and into how to handle missingness in graphs efficiently. This work also provides directions for theoretical studies on fairness in graph data.
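To make the interaction between imputation and fairness concrete, the following is a minimal sketch and not the authors' pipeline. It assumes mean imputation for a single node attribute, a hypothetical threshold rule standing in for the node classifier, and statistical parity difference as the fairness measure; the abstract does not specify any of these choices. All names and the toy data are illustrative.

```python
from statistics import mean


def mean_impute(values):
    """Replace None entries with the mean of the observed entries (assumed imputer)."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def statistical_parity_difference(predictions, protected):
    """P(pred = 1 | protected) - P(pred = 1 | not protected)."""
    prot = [p for p, g in zip(predictions, protected) if g]
    rest = [p for p, g in zip(predictions, protected) if not g]
    return mean(prot) - mean(rest)


# Toy node attribute with missing entries (None) and a protected-group flag per node.
attribute = [10.0, None, 30.0, None, 50.0, 60.0]
protected = [True, True, True, False, False, False]

imputed = mean_impute(attribute)

# Hypothetical stand-in for a trained node classifier: a fixed threshold on the attribute.
predictions = [1 if x >= 40.0 else 0 for x in imputed]

spd = statistical_parity_difference(predictions, protected)
print(f"imputed attribute: {imputed}")
print(f"statistical parity difference: {spd:.3f}")
```

Swapping `mean_impute` for a different imputer and recomputing the gap alongside accuracy gives a small-scale version of the kind of comparison the abstract describes.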

artificial intelligence, data quality, machine learning

Country:
- North America > United States (0.04)
- Oceania > New Zealand
  - North Island > Auckland Region > Auckland (0.04)
- Europe
  - Slovakia (0.04)
  - Ireland > Connaught
    - County Galway > Galway (0.04)
- Asia > Pakistan
  - Punjab > Lahore Division > Lahore (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.66)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:
- Information Technology
  - Data Science > Data Quality (1.00)
  - Artificial Intelligence > Machine Learning
    - Statistical Learning (0.96)
    - Neural Networks (0.89)
    - Performance Analysis > Accuracy (0.46)
