Censorship of Online Encyclopedias: Implications for NLP Models

Jan-22-2021, arXiv.org Artificial Intelligence

While artificial intelligence provides the backbone for many tools people use around the world, recent work has brought to attention that the algorithms powering AI are not free of politics, stereotypes, and bias. While most work in this area has focused on the ways in which AI can exacerbate existing inequalities and discrimination, very little work has studied how governments actively shape training data. We describe how censorship has affected the development of Wikipedia corpuses, text data which are regularly used for pre-trained inputs into NLP algorithms. We show that word embeddings ...

NLP impacts how firms provide products to users, content individuals receive through search and social media, and how individuals interact with news and emails. Despite the growing importance of NLP algorithms in shaping our lives, recently scholars, policymakers, and the business community have raised the alarm of how gender and racial biases may be baked into these algorithms. Because they are trained on human data, the algorithms themselves can replicate implicit and explicit human biases and aggravate discrimination [6, 8, 39]. Additionally, training data that over-represents a subset of the population may do a worse job at predicting outcomes for other groups in the population [13].

baidu baike, category, wikipedia, (14 more...)

Country:
- Europe > Russia (0.04)
- North America
  - Canada (0.05)
  - United States
    - New York > New York County
      - New York City (0.04)
    - California > San Diego County
      - La Jolla (0.14)
      - San Diego (0.04)
- Asia
  - Taiwan (0.04)
  - Uzbekistan (0.04)
  - Russia (0.04)
  - Macao (0.04)
  - Middle East
    - Republic of Türkiye (0.04)
    - Iran (0.04)
  - China
    - Hong Kong (0.04)
    - Tibet Autonomous Region (0.04)
    - Jiangxi Province > Nanchang (0.04)
    - Guangdong Province > Guangzhou (0.04)
- Africa > Eswatini
  - Manzini > Manzini (0.04)

Genre:
- Research Report (1.00)

Industry:
- Media > News (1.00)
- Law > Civil Rights & Constitutional Law (1.00)
- Government (1.00)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence
    - Natural Language (1.00)
    - Machine Learning > Performance Analysis
      - Accuracy (0.68)
