Optimizing text representations to capture (dis)similarity between political parties

Ceron, Tanise, Blokker, Nico, Padó, Sebastian

Oct-21-2022–arXiv.org Artificial Intelligence

Even though fine-tuned neural language models have been pivotal in enabling "deep" automatic text analysis, optimizing text representations for specific applications remains a crucial bottleneck. In this study, we look at this problem in the context of a task from computational social science, namely modeling pairwise similarities between political parties. Our research question is what level of structural information is necessary to create robust text representations, contrasting a strongly informed approach (which uses both claim span and claim category annotations) with approaches that forgo one or both types of annotation in favor of document structure-based heuristics. Evaluating our models on the manifestos of German parties for the 2021 federal election, we find that heuristics that maximize within-party over between-party similarity, along with a normalization step, lead to reliable party similarity prediction without the need for manual annotation.
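To make the idea of within-party versus between-party similarity concrete, here is a minimal sketch. It assumes sentence embeddings are already available as a NumPy array with one party label per sentence. The abstract does not specify the normalization step or how sentence similarities are aggregated, so the per-dimension z-scoring, the mean cosine aggregation, the function names `zscore` and `party_similarity`, and the toy parties A, B and C are all illustrative assumptions rather than the authors' method.

```python
import numpy as np


def zscore(embeddings):
    """Standardize each embedding dimension (an assumed normalization step)."""
    mean = embeddings.mean(axis=0)
    std = embeddings.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant dimensions
    return (embeddings - mean) / std


def party_similarity(embeddings, labels):
    """Mean cosine similarity between the sentences of every pair of parties.

    Diagonal entries are within-party similarities (self-pairs excluded);
    off-diagonal entries are between-party similarities.
    Each party needs at least two sentences.
    """
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    cos = unit @ unit.T
    labels = np.asarray(labels)
    parties = sorted(set(labels.tolist()))
    sim = np.zeros((len(parties), len(parties)))
    for i, p in enumerate(parties):
        for j, q in enumerate(parties):
            block = cos[np.ix_(labels == p, labels == q)]
            if i == j:
                n = block.shape[0]
                # drop the trivial similarity of each sentence with itself
                sim[i, j] = (block.sum() - np.trace(block)) / (n * (n - 1))
            else:
                sim[i, j] = block.mean()
    return parties, sim


# Toy data: three hypothetical parties, five "sentences" each, 3-d embeddings.
rng = np.random.default_rng(0)
centers = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [0.0, 0.0, 1.0]}
labels = [p for p in centers for _ in range(5)]
raw = np.array([centers[p] for p in labels])
raw = raw + rng.normal(scale=0.05, size=raw.shape)

parties, sim = party_similarity(zscore(raw), labels)
within = np.diag(sim)
between = sim[~np.eye(len(parties), dtype=bool)]
print(parties)
print(np.round(sim, 2))
```

In this toy setup the within-party values on the diagonal are higher than the between-party values off the diagonal, which is the property a representation would be selected for under the heuristic described above.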

artificial intelligence, machine learning, natural language

Country:
- Oceania > Australia
  - Victoria > Melbourne (0.04)
- North America > United States
  - New York (0.04)
  - Minnesota > Hennepin County
    - Minneapolis (0.14)
  - Louisiana > Orleans Parish
    - New Orleans (0.04)
- Europe
  - United Kingdom > England
    - Oxfordshire > Oxford (0.04)
  - Spain > Valencian Community
    - Valencia Province > Valencia (0.04)
  - Middle East > Republic of Türkiye
    - Istanbul Province > Istanbul (0.04)
  - Germany
    - Bremen > Bremen (0.28)
    - Baden-Württemberg > Stuttgart Region
      - Stuttgart (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Middle East > Republic of Türkiye
    - Istanbul Province > Istanbul (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Government
  - Voting & Elections (1.00)
  - Regional Government > North America Government
    - United States Government (0.48)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Natural Language > Text Processing (0.88)
