A Bilingual Analysis of Cohesion in a Corpus of Leader Speeches

AAAI Conferences

In this paper, we study the cohesion of a leader's speeches over time. This is part of a larger project that investigates the language of leaders and how it changes over their stay in power. Here, we analyze the speeches of a leader who stayed in power for a long period of time, i.e., more than 30 years. We measure the cohesion of the speeches in the original language, which is Arabic in our case, as well as in English, based on human translations of the original speeches. Cohesion is measured in two different ways: using word overlap and using Latent Semantic Analysis. Because of the morphological complexity of Arabic, the word overlap measure of cohesion is challenging to apply to Arabic. Latent Semantic Analysis, which is fully unsupervised, is applied in the same way to Arabic and English. The results show that cohesion has a general downward trend over time and that, during and after major crises, the leader's speeches exhibit an increase in cohesion, which can be explained as an attempt on the leader's part to make his policies clearer, most likely as a form of post-crisis management.
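As an illustration of the word overlap measure mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes cohesion is the mean Jaccard overlap of word types between adjacent sentences; the function names and the toy speech are hypothetical. It matches surface forms only, so for a morphologically rich language such as Arabic it would likely undercount overlap unless the tokens were first stemmed or lemmatized.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of word types."""
    return set(re.findall(r"\w+", sentence.lower()))


def overlap(a, b):
    """Jaccard overlap between the word sets of two sentences."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def word_overlap_cohesion(sentences):
    """Mean overlap across adjacent sentence pairs (0.0 if fewer than two)."""
    pairs = list(zip(sentences, sentences[1:]))
    if not pairs:
        return 0.0
    return sum(overlap(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy speech: the first two sentences share vocabulary,
# the third shares none with the second.
speech = [
    "The nation needs reform.",
    "Reform will strengthen the nation.",
    "Our farmers had a good harvest.",
]
score = word_overlap_cohesion(speech)
print(score)
```

Computing one such score per speech and plotting the scores by date would give a cohesion trend over time of the kind the abstract describes.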


Automatic Coherence Profile in Public Speeches of Three Latin American Heads-of-State

AAAI Conferences

Several studies provide evidence that the computational psycholinguistic algorithm called Latent Semantic Analysis (LSA) can measure local and global coherence in texts in a way that resembles human evaluation (Foltz, Kintsch & Landauer 1998; McNamara, Cai & Louwerse 2007; McCarthy, Briner, Rus & McNamara 2007; McNamara, Louwerse & Jeuniaux 2009; Louwerse, McCarthy & Graesser 2010). The texts used in all these studies are written in English and are scientific and literary texts. In Spanish, some studies use LSA to measure the semantic similarity between texts in automatic summary assessment (Pérez, Alfonseca, Rodríguez, Gliozzo, Strapparava & Magnini 2005; León, Olmos, Escudero, Cañas & Salmerón 2006; Venegas 2007, 2009, 2011); however, automatic measurement of coherence in Spanish has not yet been sufficiently investigated. The present study aimed to identify a global and local coherence profile in a corpus of Spanish-language speeches by three Latin American Heads of State (Perón, Castro and Pinochet), using Latent Semantic Analysis. Local coherence is calculated by measuring the implicit semantic similarity between adjacent sentences, and global coherence by measuring the similarity among the semantic content of the paragraphs. The corpus under analysis is a sample of 107 speeches. The semantic space was built using a multi-register corpus and is available through the “Interface for the measurement of lexical-semantic similarity” in the El Grial interface (www.elgrial.cl). Results showed a systematic difference between the speeches of the Heads of State in terms of both local and global coherence. The Bonferroni analysis established an effect that distinguishes Perón’s speeches from Pinochet’s and Castro’s speeches. These results show that Perón’s speeches are more topically related than the other leaders’, probably due to a discourse strategy to persuade voters. The identification of a coherence profile might be relevant for predicting cues of government discourse styles.
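The two coherence measures described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: it assumes each sentence or paragraph has already been mapped to a vector in an LSA semantic space (the toy vectors below are invented), that similarity is cosine similarity, and that global coherence is the mean similarity over all paragraph pairs, which the abstract does not specify. All function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def local_coherence(sentence_vectors):
    """Mean cosine similarity between adjacent sentence vectors."""
    sims = [cosine(u, v) for u, v in zip(sentence_vectors, sentence_vectors[1:])]
    return sum(sims) / len(sims) if sims else 0.0


def global_coherence(paragraph_vectors):
    """Mean cosine similarity over all pairs of paragraph vectors (an assumption)."""
    sims = [cosine(u, v) for u, v in combinations(paragraph_vectors, 2)]
    return sum(sims) / len(sims) if sims else 0.0


# Invented two-dimensional vectors standing in for LSA projections.
sentence_vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
paragraph_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

local_score = local_coherence(sentence_vectors)
global_score = global_coherence(paragraph_vectors)
print(local_score, global_score)
```

In practice the vectors would come from a semantic space such as the one the abstract reports building from a multi-register corpus; the sketch only shows how per-speech scores could be aggregated once those vectors exist.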


How an AI Algorithm Learned to Write Political Speeches

#artificialintelligence

"Ask not what your country can do for you; ask what you can do for your country." When it comes to political speeches, great ones are few and far between. But ordinary political speeches, those given in U.S. congressional floor debates, for example, are numerous. They are also remarkably similar. These speeches tend to follow a standard format, repeat similar arguments, and even use the same phrases to indicate a particular political affiliation or opinion.


A Dataset of General-Purpose Rebuttal

arXiv.org Artificial Intelligence

In Natural Language Understanding, the task of response generation usually focuses on responses to short texts, such as tweets or a turn in a dialog. Here we present a novel task of producing a critical response to a long argumentative text, and suggest a method based on general rebuttal arguments to address it. We do this in the context of the recently suggested task of listening comprehension over argumentative content: given a speech on some specified topic, and a list of relevant arguments, the goal is to determine which of the arguments appear in the speech. The general rebuttals we describe here (written in English) remove the need for topic-specific arguments to be provided, because they prove applicable to a large set of topics. This allows creating responses beyond the scope of topics for which specific arguments are available. All data collected during this work is freely available for research.