
Collaborating Authors: Taboada, Maite


Dimensions of Online Conflict: Towards Modeling Agonism

arXiv.org Artificial Intelligence

Agonism plays a vital role in democratic dialogue by fostering diverse perspectives and robust discussions. Within the realm of online conflict, however, there is another type: hateful antagonism, which undermines constructive dialogue. Detecting conflict online is central to platform moderation and monetization. It is also vital for democratic dialogue, but only when it takes the form of agonism. To model these two types of conflict, we collected Twitter conversations related to trending controversial topics. We introduce a comprehensive annotation schema for labelling different dimensions of conflict in the conversations, such as the source of conflict, the target, and the rhetorical strategies deployed. Using this schema, we annotated approximately 4,000 conversations with multiple labels. We then trained both logistic regression and transformer-based models on the dataset, incorporating context from the conversation, including the number of participants and the structure of the interactions. Results show that contextual labels help identify conflict and make the models robust to variations in topic. Our research contributes a conceptualization of different dimensions of conflict, a richly annotated dataset, and promising results that can inform content moderation.
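As a rough illustration of the kind of setup the abstract describes (not the paper's actual code or data format), the sketch below combines TF-IDF text features with simple conversation-level context, such as the number of participants and reply depth, before fitting a logistic regression. The file name conversations.csv and its column names are hypothetical assumptions.

```python
# Hypothetical sketch: text features + conversation-level contextual features
# feeding a logistic regression classifier. Column names are assumptions.
import pandas as pd
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Assumed columns: text, n_participants, depth, label
df = pd.read_csv("conversations.csv")

# Lexical features from the conversation text.
vectorizer = TfidfVectorizer(max_features=20_000, ngram_range=(1, 2))
X_text = vectorizer.fit_transform(df["text"])

# Contextual features: number of participants and structural depth of the thread.
X_context = csr_matrix(df[["n_participants", "depth"]].to_numpy(dtype=float))
X = hstack([X_text, X_context])

X_train, X_test, y_train, y_test = train_test_split(
    X, df["label"], test_size=0.2, random_state=42, stratify=df["label"]
)

clf = LogisticRegression(max_iter=1000, class_weight="balanced")
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```

A transformer-based variant would replace the TF-IDF features with encoded conversation text, but the idea of appending contextual signals to the representation stays the same.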


Radar de Parité: An NLP system to measure gender representation in French news stories

arXiv.org Artificial Intelligence

We present the Radar de Parité, an automated Natural Language Processing (NLP) system that measures the proportion of women and men quoted daily in six Canadian French-language media outlets. We outline the system's architecture and detail the challenges we overcame to address French-specific issues, in particular regarding coreference resolution, a new contribution to the NLP literature on French. Our results highlight the underrepresentation of women in news stories, while also illustrating the application of modern NLP methods to measure gender representation and address societal issues.

The commonality in most applied NLP research projects is the need to reliably and scalably extract information from unstructured text data. In this paper, we describe one such application: extracting quotes from news stories to quantify gender representation. Gender representation in the media is a long-debated topic. Since the 1970s, there have been studies of how often women and gender-diverse people are portrayed in news stories, with the general hypothesis that they tend to be underrepresented [1, 2]. There is also research studying how they are represented, i.e., whether sexist or homophobic tropes are present when women and gender-diverse people are discussed [3, 4]. In this work, we tackle one specific aspect of representation: who is quoted and in what proportions. Our starting hypothesis is that we hear less from women than from men in news stories, that is, that men are quoted more often than would be expected from their proportion in the general population. To fully answer this question, we formulate a quantitative approach, collecting large amounts of representative data and extracting quotes from the unstructured text. This is the goal of the Radar de Parité. We define quotes as either direct or indirect reproductions of what a person said, and we define that person as a source in news articles. To extract quotes, we employ a full NLP pipeline, focusing on parsing to identify speakers, verbs, and quotes in each news story. We then predict the gender of the speaker (or source) using external gender-prediction services.
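The following is a minimal sketch of the kind of parse-based quote extraction described above, not the actual Radar de Parité pipeline: it pairs French reporting verbs with their grammatical subjects (speakers) and quoted content using a spaCy dependency parse. The verb list, the fr_core_news_md model choice, and the attachment heuristics are illustrative assumptions.

```python
# Hypothetical sketch of dependency-based quote extraction for French text.
# The reporting-verb list and heuristics are assumptions for illustration only.
import spacy

nlp = spacy.load("fr_core_news_md")  # assumes this French model is installed

REPORTING_VERBS = {"dire", "déclarer", "affirmer", "expliquer", "ajouter", "souligner"}

def extract_quotes(text):
    """Return (speaker, reporting verb, quoted content) triples found in the text."""
    doc = nlp(text)
    quotes = []
    for token in doc:
        if token.pos_ == "VERB" and token.lemma_ in REPORTING_VERBS:
            # Speaker: the grammatical subject of the reporting verb.
            speakers = [c for c in token.children if c.dep_ == "nsubj"]
            # Quoted content: a clausal or object dependent of the verb.
            content = [c for c in token.children if c.dep_ in ("ccomp", "obj")]
            if speakers and content:
                span = doc[content[0].left_edge.i : content[0].right_edge.i + 1]
                quotes.append((speakers[0].text, token.lemma_, span.text))
    return quotes

print(extract_quotes("« Nous devons agir », a déclaré la ministre."))
```

In a production setting, a step like this would be followed by coreference resolution (to map pronouns back to named speakers) and by gender prediction for each resolved source, as the paper describes.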