Computational Replication of Human Paraphrase Assessment
McCarthy, Philip Michael (The University of Memphis) | Cai, Zhiqiang (The University of Memphis) | McNamara, Danielle S. (The University of Memphis)
Two sentences are paraphrases if their meanings are equivalent but their words and syntax differ. Paraphrasing can be used to aid comprehension, stimulate prior knowledge, and assist in the development of writing skills. While automated paraphrase assessment is both commonplace and useful, research has centered solely on artificial, edited paraphrases and has used only binary dimensions (i.e., a sentence either is or is not a paraphrase). In this study, we use 1,998 natural paraphrases generated by high school students that have been assessed along 10 dimensions of paraphrase (e.g., semantic completeness). This study investigates the components of paraphrase quality emerging from these dimensions, and examines whether computational approaches (e.g., latent semantic analysis (LSA) and minimal edit distance (MED)) can simulate those human evaluations. The results suggest that semantic and syntactic evaluations are the primary components of paraphrase quality, and that computationally light systems such as LSA (semantics) and MED (syntax) present promising approaches to simulating human evaluations of paraphrases.
May-21-2009
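As a rough illustration of the MED idea named in the abstract, the sketch below computes a word-level edit distance between a target sentence and a paraphrase. It is a generic Levenshtein calculation, not the authors' implementation: the tokenization rule, the length normalization, and the function names `tokenize`, `edit_distance`, and `normalized_distance` are all assumptions made for this example.

```python
PUNCTUATION = ".,;:!?\"'()"


def tokenize(sentence):
    # Assumption: lowercase, split on whitespace, strip surrounding punctuation.
    words = (w.strip(PUNCTUATION) for w in sentence.lower().split())
    return [w for w in words if w]


def edit_distance(source, target):
    # Levenshtein distance over two sequences (here, lists of word tokens),
    # computed one row at a time to keep memory proportional to len(target).
    previous = list(range(len(target) + 1))
    for i, s_tok in enumerate(source, start=1):
        current = [i]
        for j, t_tok in enumerate(target, start=1):
            cost = 0 if s_tok == t_tok else 1
            current.append(min(
                previous[j] + 1,         # delete s_tok
                current[j - 1] + 1,      # insert t_tok
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def normalized_distance(sentence_a, sentence_b):
    # Assumption: divide by the longer token count so scores fall in [0, 1].
    tokens_a, tokens_b = tokenize(sentence_a), tokenize(sentence_b)
    longest = max(len(tokens_a), len(tokens_b))
    if longest == 0:
        return 0.0
    return edit_distance(tokens_a, tokens_b) / longest
```

Under these assumptions, a score near 0 means the paraphrase reuses the original word order almost unchanged, while a score near 1 means nearly every word position differs. A semantic measure such as LSA would be needed alongside it to check that the meaning is preserved.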
- Country:
- North America > United States > Tennessee > Shelby County > Memphis (0.04)
- North America > United States > New Jersey > Bergen County > Mahwah (0.05)
- North America > United States > Massachusetts > Plymouth County > Norwell (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > San Mateo County > Menlo Park (0.04)
- Europe > Switzerland
- Asia > Japan > Hokkaidō > Hokkaidō Prefecture > Sapporo (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Education > Educational Setting > K-12 Education > Secondary School (0.54)
- Technology: