A Linguistic Analysis of Student-Generated Paraphrases

Rus, Vasile (The University of Memphis) | Feng, Shi (The University of Memphis) | Brandon, Russell (The University of Memphis) | Crossley, Scott (Georgia State University) | McNamara, Danielle S. (The University of Memphis)

May-18-2011–AAAI Conferences

Paraphrase identification is a core Natural Language Processing task that involves assessing the semantic similarity of two texts. To foster systematic studies of this task, standardized datasets were created on which various approaches could be compared more fairly. However, a better understanding and a more precise operational definition of a paraphrase are needed before any further datasets or systematic evaluations of paraphrase identification are proposed. This study develops the concept of paraphrasing as a writing strategy. Six types of paraphrases are defined through the creation of a relatively large corpus of student-generated paraphrases. These paraphrases are analyzed along several dozen linguistic dimensions ranging from cohesion to lexical diversity. The most significant indices from these dimensions are then used to build a prediction model that distinguishes true from false paraphrases and identifies each of the six paraphrase types.

