Education
Measuring Hint Level in Open Cloze Questions
Pino, Juan (Carnegie Mellon University) | Eskenazi, Maxine (Carnegie Mellon University)
Providing the first few letters of a missing word in a sentence gives information about this word. This paper attempts to measure the information transmitted in that case. In order to do so, we analyzed response accuracy for open cloze questions, that is, fill-in-the-blank questions without multiple choice answers. In this study, native and non-native speakers of English answered a series of open cloze questions that were semi-automatically generated. Hints were provided that consisted of the first few letters of the missing word. Results showed that question difficulty, hence the quantity of information transmitted, is related to the number of letters that are provided, to physical properties of these letters and to syllables formed by these letters. Performance did not appear to depend on letter or syllable frequency. Controlling hint level in a word completion task is critical in order to provide practice exercises adapted to student levels.
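As an illustration of the task described in this abstract, the following minimal sketch builds an open cloze question whose hint is the first few letters of the missing word. The function name `make_open_cloze`, the underscore blank format, and whitespace tokenization are assumptions made for this example; they are not taken from the paper.

```python
def make_open_cloze(sentence: str, target: str, hint_letters: int) -> str:
    """Blank out `target` in `sentence`, keeping its first `hint_letters` letters.

    Hypothetical helper: words are split on whitespace, and every remaining
    letter of the target is replaced by an underscore.
    """
    words = sentence.split()
    if target not in words:
        raise ValueError("target must appear as a whitespace-delimited word")
    hint = target[:hint_letters]
    blank = hint + "_" * (len(target) - len(hint))
    return " ".join(blank if word == target else word for word in words)


if __name__ == "__main__":
    # Increasing the hint level reveals more letters of the missing word.
    for level in range(3):
        print(level, make_open_cloze("The cat sat on the mat", "sat", level))
```

Varying `hint_letters` corresponds to the hint level in the abstract; a hint level of 0 gives a plain open cloze question.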
Promoting Reflection and its Effect on Learning in a Programming Tutor
Kumar, Amruth N. (Ramapo College of New Jersey)
We studied the effect of post-practice reflection on learning, using programming tutors, and multiple-choice format for reflection. We conducted in-vivo controlled studies with introductory programming students from multiple schools over 3 semesters, and used mixed-factor ANOVA to analyze the collected data. We found that reflecting on the concept underlying each problem neither promotes greater learning, measured as pre-post increase in the average score per problem, nor promotes faster learning, measured as the problems solved per concept learned. We conjecture that the benefits of reflecting on the concept underlying each problem may be limited if a tutor already promotes deep understanding of the domain.
Incorporating an Affective Behavior Model into an Educational Game
Hernández, Yasmín (Instituto de Investigaciones Electricas) | Sucar, Enrique (Instituto Nacional de Astrofisica, Optica y Electronica) | Conati, Cristina (University of British Columbia)
Emotions are a ubiquitous component of motivation and learning. We have developed an affective behavior model for intelligent tutoring systems that considers both the affective and knowledge state of the student to generate tutorial actions. The affective behavior model (ABM) was designed based on teachers' expertise obtained through interviews. It relies on a dynamic decision network with a utility measure on both student learning and affect to generate tutorial actions aimed at balancing the two. We have integrated and evaluated the ABM in an educational game to learn number factorization. We carried out a controlled user study to evaluate the impact of the affective model on learning. The results show that for the younger students there is a significant improvement on learning when the affective behavior model is incorporated.
Special Track on Intelligent Tutoring Systems
Ward, Arthur (University of Pittsburgh) | Murray, Chas (Carnegie Learning)
Researchers in the field of intelligent tutoring systems (ITS) seek to create computerized tutors that can rival the learning gains produced by human tutoring, the most effective form of instruction known. The goal of the researchers is to produce ITS that provide flexible, efficient, individualized instruction to every student. Pursuit of this common goal has led them to examine many different aspects of how students learn from tutors, how human tutors interact with their students, and how students learn in collaborative environments. Insights from those studies have informed further research into ways that computer systems can detect and respond to student knowledge gaps, misconceptions, affective states and other attributes. This research has produced important work in student modeling, knowledge representation, dialog systems, and authoring tools for efficiently creating ITS in new domains.
Knowledge Engineering with Didactic Knowledge - First Steps towards an Ultimate Goal
Knauf, Rainer (Ilmenau University of Technology) | Boeck, Ronald (University of Magdeburg) | Sakurai, Yoshitaka (Tokyo Denki University) | Tsuruta, Setsuo (Tokyo Denki University)
Generally, learning systems suffer from a lack of an explicit and adaptable didactic design. A previously introduced modeling approach called storyboarding is setting the stage to apply Knowledge Engineering Technologies to verify and validate the didactics behind a learning process. Moreover, didactics can be refined according to revealed weaknesses and proven excellence. Successful didactic patterns can be explored by applying mining techniques to the various ways students went through the storyboard and their associated level of success.
c-rater: Automatic Content Scoring for Short Constructed Responses
Sukkarieh, Jana Zuheir (Educational Testing Service) | Blackmore, John (Educational Testing Service)
The education community is moving towards constructed or free-text responses and computer-based assessment. At the same time, progress in natural language processing and knowledge representation has made it possible to consider free-text or constructed responses without having to fully understand the text. c-rater is a technology at Educational Testing Service (ETS) used for automatic content scoring for short, free-text responses. This paper describes some of the major developments made in c-rater recently.
Computational Considerations in Correcting User-Language
Renner, Adam M. (University of Memphis) | McCarthy, Philip M. (University of Memphis) | McNamara, Danielle S. (University of Memphis)
This study evaluates the robustness of established computational indices used to assess text relatedness in user-language. The original User-Language Paraphrase Corpus (ULPC) was compared to a corrected version, in which each paraphrase was corrected for typographical and grammatical errors. Error correction significantly affected values for each of five computational indices, indicating greater similarity of the target sentence to the corrected paraphrase than to the original paraphrase. Moreover, misspelled target words accounted for a large proportion of the differences. This study also evaluated potential effects on correlations between computational indices and human ratings of paraphrases. The corrections did not yield assessments that were any more or less comparable to trained human raters than were the original paraphrases containing typographical or grammatical errors. The results suggest that although correcting for errors may optimize certain computational indices, the corrections are not necessary for comparing the indices to expert ratings.
Computational Replication of Human Paraphrase Assessment
McCarthy, Philip Michael (The University of Memphis) | Cai, Zhiqiang (The University of Memphis) | McNamara, Danielle S. (The University of Memphis)
Two sentences are paraphrases if their meanings are equivalent but their words and syntax are different. Paraphrasing can be used to aid comprehension, stimulate prior knowledge, and assist in writing skills development. While automated paraphrase assessment is both commonplace and useful, research has centered solely on artificial, edited paraphrases and has used only binary dimensions (i.e., is or is-not a paraphrase). In this study, we use 1998 natural paraphrases generated by high school students that have been assessed along 10 dimensions of paraphrase (e.g., semantic completeness). This study investigates the components of paraphrase quality emerging from these dimensions, and examines whether computational approaches (e.g., LSA, MED) can simulate those human evaluations. The results suggest that semantic and syntactic evaluations are the primary components of paraphrase quality, and that computationally light systems such as LSA (semantics) and MED (syntax) present promising approaches to simulating human evaluations of paraphrases.
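To make "computationally light" concrete, here is a minimal sketch of an edit distance over word tokens, assuming that MED in this abstract denotes a minimal edit distance between a target sentence and its paraphrase. This is the standard Levenshtein recurrence, not the authors' implementation; the function name and the whitespace tokenization are assumptions for this example.

```python
from typing import Sequence


def edit_distance(source: Sequence[str], target: Sequence[str]) -> int:
    """Return the Levenshtein distance between two token sequences.

    Counts the fewest insertions, deletions, and substitutions needed
    to turn `source` into `target`, using one row of memory at a time.
    """
    previous = list(range(len(target) + 1))
    for i, source_token in enumerate(source, start=1):
        current = [i]
        for j, target_token in enumerate(target, start=1):
            substitution_cost = 0 if source_token == target_token else 1
            current.append(min(
                previous[j] + 1,                      # delete source_token
                current[j - 1] + 1,                   # insert target_token
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    original = "the cat sat".split()
    paraphrase = "the dog sat down".split()
    print(edit_distance(original, paraphrase))
```

Because the distance is computed over word order and identity, it is sensitive to surface structure rather than meaning, which is consistent with the abstract pairing it with a semantic measure such as LSA.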
Assessment of LDAT as a Grammatical Diversity Assessment Tool
Healy, Scott Leigh (The University of Memphis) | Weintraub, Joseph D. (The University of Memphis) | McCarthy, Philip M. (The University of Memphis) | Hall, Charles E. (The University of Memphis) | McNamara, Danielle S. (The University of Memphis)
The purpose of this study is to evaluate the validity of measuring grammatical diversity with a specifically designed Lexical Diversity Assessment Tool (LDAT). A secondary objective is to use LDAT to determine if the level of difficulty assigned to English as a Second Language (ESL) texts corresponds to increases in grammatical, lexical, and temporal diversity. Other methods of lexical diversity assessment, such as type-token ratio (TTR), have been used with varying accuracy in an effort to determine the complexity or level of texts. We analyzed 120 ESL texts independently assigned by their sources to one of four levels (Beginner, Lower-intermediate, Upper-intermediate, and Advanced). We demonstrated that LDAT significantly reflected the grammatical diversity within these texts. While the findings conflicted with the prediction that grammatical and lexical diversity would increase with assigned level, we concluded that the implementation of LDAT in text design could provide reliable assessments of grammatical diversity.
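For reference, the type-token ratio mentioned in this abstract is the number of distinct word forms divided by the total number of words. The sketch below is a minimal illustration that assumes lowercased, whitespace-delimited tokens; it is unrelated to how LDAT itself is implemented, which the abstract does not describe.

```python
def type_token_ratio(text: str) -> float:
    """Return distinct tokens divided by total tokens (0.0 for empty text).

    Assumption for this sketch: tokens are lowercased and split on whitespace,
    with no stemming or punctuation handling.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    print(type_token_ratio("the cat and the dog"))
```

A known limitation of TTR is that it tends to fall as texts get longer, since common words repeat, which is one reason its accuracy varies across texts of different lengths.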
From Mad Libs to Tic Tac Toe: Using Robots and Game Programming as a Theme in an Introduction to Programming Course for Non-Majors
Kay, Jennifer S. (Rowan University)
Computer Science has a bad reputation among non-CS majors. This paper describes three assignments from a gentle introduction to programming course for non-majors that uses robots and simple game programming as a hook to get students interested in the subject. In each of the assignments presented, what might be considered a trivial twist to an instructor was a key factor in making an otherwise standard project into something that is more engaging.