Collaborating Authors

Xiong

Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input

arXiv.org Artificial Intelligence

We demonstrate that current state-of-the-art approaches to Automated Essay Scoring (AES) are not well-suited to capturing adversarially crafted input of grammatical but incoherent sequences of sentences. We develop a neural model of local coherence that can effectively learn connectedness features between sentences, and propose a framework for integrating and jointly training the local coherence model with a state-of-the-art AES model. We evaluate our approach against a number of baselines and experimentally demonstrate its effectiveness on both the AES task and the task of flagging adversarial input, further contributing to the development of an approach that strengthens the validity of neural essay scoring models.


A Topic-Based Coherence Model for Statistical Machine Translation

AAAI Conferences

Coherence, which ties the sentences of a text into a meaningfully connected structure, is of great importance to text generation and translation. In this paper, we propose a topic-based coherence model to produce coherence for document translation, in terms of the continuity of sentence topics in a text. We automatically extract a coherence chain for each source text to be translated. Based on the extracted source coherence chain, we adopt a maximum entropy classifier to predict the target coherence chain, which defines a linear topic structure for the target document. The proposed topic-based coherence model then uses the predicted target coherence chain to help the decoder select coherent word/phrase translations. Our experiments show that incorporating the topic-based coherence model into machine translation achieves substantial improvements over both the baseline and previous methods that integrate document topics rather than coherence chains into machine translation.
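
The pipeline described in this abstract (extract a source coherence chain, then predict a target chain with a maximum entropy classifier) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that each sentence already has a topic distribution from some topic model, that the chain is the most probable topic per sentence, and that the classifier uses two hypothetical indicator features (`src=` and `prev_src=`) with hand-set weights. All function names, features, and weights are illustrative assumptions.

```python
import math


def extract_chain(sentence_topic_dists):
    """Assumed chain extraction: take the most probable topic index per sentence."""
    return [max(range(len(d)), key=d.__getitem__) for d in sentence_topic_dists]


def maxent_predict(features, weights, num_target_topics):
    """Score each target topic with a log-linear (maximum entropy) model.

    `weights` maps (feature_name, target_topic) to a float; missing pairs count as 0.
    Returns the best topic index and the full probability distribution.
    """
    scores = [
        sum(weights.get((f, t), 0.0) for f in features)
        for t in range(num_target_topics)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(num_target_topics), key=probs.__getitem__)
    return best, probs


def predict_target_chain(source_chain, weights, num_target_topics):
    """Predict one target topic per sentence from hypothetical source-side features."""
    target_chain = []
    for i, topic in enumerate(source_chain):
        prev_topic = source_chain[i - 1] if i > 0 else None
        features = [f"src={topic}", f"prev_src={prev_topic}"]
        best, _ = maxent_predict(features, weights, num_target_topics)
        target_chain.append(best)
    return target_chain


# Toy usage with made-up distributions and weights (not taken from the paper).
dists = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]]
source_chain = extract_chain(dists)
toy_weights = {("src=0", 1): 2.0, ("src=1", 0): 2.0}
target_chain = predict_target_chain(source_chain, toy_weights, 2)
```

In a real system the weights would be learned from aligned source and target documents, and the predicted chain would be passed to the decoder as an additional feature. The abstract does not describe those steps in enough detail to sketch them here.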


Automatic Coherence Profile in Public Speeches of Three Latin American Heads-of-State

AAAI Conferences

Several studies provide evidence that the computational psycholinguistic algorithm called Latent Semantic Analysis (LSA) can measure local and global coherence in texts in a way that is similar to human evaluation (Foltz, Kintsch & Landauer 1998; McNamara, Cai & Louwerse 2007; McCarthy, Briner, Rus & McNamara 2007; McNamara, Louwerse & Jeuniaux 2009; Louwerse, McCarthy & Graesser 2010). The texts used in all these studies are scientific and literary texts written in English. In Spanish, some studies have used LSA to measure the semantic similarity between texts in automatic summary assessment (Pérez, Alfonseca, Rodríguez, Gliozzo, Strapparava & Magnini 2005; León, Olmos, Escudero, Cañas & Salmerón 2006; Venegas 2007, 2009, 2011); however, automatic measurement of coherence in Spanish has not yet been sufficiently investigated. The present study aimed to identify a global and local coherence profile in a corpus of Spanish-language speeches by three Latin American Heads-of-State (Perón, Castro and Pinochet), using Latent Semantic Analysis. Local coherence is calculated by measuring the implicit semantic similarity between adjacent sentences, and global coherence by measuring the similarity among the semantic content of the paragraphs. The corpus under analysis is a sample of 107 speeches. The semantic space was built using a multi-register corpus and is available through the “Interface for the measurement of lexical-semantic similarity” in the El Grial interface (www.elgrial.cl). Results showed a systematic difference between the speeches of the Heads-of-State in terms of both local and global coherence. The Bonferroni analysis established an effect that distinguishes Perón’s speeches from Pinochet’s and Castro’s speeches. These results show that Perón’s speeches are more topically related than the other leaders’, probably due to a discourse strategy to persuade voters. The identification of a coherence profile might be relevant for predicting cues of government discourse styles.
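
The two coherence measures described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the study's procedure. It assumes that every sentence has already been projected into an LSA semantic space as a plain list of floats, that a paragraph vector is the sum of its sentence vectors, and that global coherence is the mean cosine similarity over all paragraph pairs. The abstract does not specify these aggregation choices, and the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def local_coherence(sentence_vectors):
    """Mean cosine similarity between each pair of adjacent sentences."""
    pairs = list(zip(sentence_vectors, sentence_vectors[1:]))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def paragraph_vector(sentence_vectors):
    """Assumed paragraph representation: component-wise sum of its sentence vectors."""
    return [sum(column) for column in zip(*sentence_vectors)]


def global_coherence(paragraphs):
    """Assumed global measure: mean cosine similarity over all paragraph pairs."""
    vectors = [paragraph_vector(p) for p in paragraphs]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy two-dimensional "semantic space" vectors, invented for illustration only.
sentences = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
speech = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]
local_score = local_coherence(sentences)
global_score = global_coherence(speech)
```

With real data, the sentence vectors would come from the semantic space built on the multi-register corpus, and the per-speech scores would then be compared across speakers.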