Discovering Differences in the Representation of People using Contextualized Semantic Axes

Lucy, Li, Tadimeti, Divya, Bamman, David

Oct-21-2022–arXiv.org Artificial Intelligence

A common paradigm for identifying semantic differences across social and temporal contexts is the use of static word embeddings and their distances. In particular, past work has compared embeddings against "semantic axes" that represent two opposing concepts. We extend this paradigm to BERT embeddings, and construct contextualized axes that mitigate the pitfall where antonyms have neighboring representations. We validate and demonstrate these axes on two people-centric datasets: occupations from Wikipedia, and multi-platform discussions in extremist, men's communities over fourteen years. In both studies, contextualized semantic axes can characterize differences among instances of the same word type. In the latter study, we show that references to women and the contexts around them have become more detestable over time.

computational linguistic, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

Oct-21-2022

arXiv.org PDF

Add feedback

Country:
- Oceania > Australia
  - Victoria > Melbourne (0.04)
- North America
  - Dominican Republic (0.04)
  - Canada (0.04)
  - United States
    - Pennsylvania (0.04)
    - Texas > Travis County
      - Austin (0.04)
    - New York > New York County
      - New York City (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - California
      - San Diego County > San Diego (0.04)
      - Alameda County > Berkeley (0.04)
- Europe
  - Ireland (0.04)
  - Germany (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Middle East > Qatar
    - Ad-Dawhah > Doha (0.04)

Genre:
- Research Report (1.00)

Industry:
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence
    - Machine Learning (1.00)
    - Natural Language > Text Processing (0.93)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found