A Greek Parliament Proceedings Dataset for Computational Linguistics and Political Analysis

Jan-18-2025, 16:55:39 GMT–Neural Information Processing Systems

Large, diachronic datasets of political discourse are hard to come by, especially for resource-lean languages such as Greek. In this paper, we introduce a curated dataset of the Greek Parliament Proceedings that spans 1989 to 2020. It consists of more than 1 million speeches with extensive metadata, extracted from 5,355 parliamentary sitting record files. We explain how it was constructed and the challenges that had to be overcome. The dataset can be used for both computational linguistics and political analysis, ideally combining the two.

computational linguistics and political analysis, Greek Parliament Proceedings dataset

Industry:
- Government (0.46)

Technology:
- Information Technology > Artificial Intelligence > Natural Language (0.66)