"There Is No Such Thing as a Dumb Question," But There Are Good Ones
Shin, Minjung, Kim, Donghyun, Ryu, Jeh-Kwang
Questioning has become increasingly crucial for both humans and artificial intelligence, yet there remains limited research comprehensively assessing question quality. In response, this study defines good questions and presents a systematic evaluation framework. We propose two key evaluation dimensions: appropriateness (sociolinguistic competence in context) and effectiveness (strategic competence in goal achievement). Based on these foundational dimensions, a rubric-based scoring system was developed. By incorporating dynamic contextual variables, our evaluation framework achieves structure and flexibility through semi-adaptive criteria. The methodology was validated using the CAUS and SQUARE datasets, demonstrating the ability of the framework to assess both well-formed and problematic questions while adapting to varied contexts. As we establish a flexible and comprehensive framework for question evaluation, this study takes a significant step toward integrating questioning behavior with structured analytical methods grounded in the intrinsic nature of questioning.
CAUS: A Dataset for Question Generation based on Human Cognition Leveraging Large Language Models
Shin, Minjung, Kim, Donghyun, Ryu, Jeh-Kwang
We introduce the Curious About Uncertain Scene (CAUS) dataset, designed to enable Large Language Models, specifically GPT-4, to emulate human cognitive processes for resolving uncertainties. Leveraging this dataset, we investigate the potential of LLMs to engage in questioning effectively. Our approach involves providing scene descriptions embedded with uncertainties to stimulate the generation of reasoning and queries. The queries are then classified according to multi-dimensional criteria. All procedures are facilitated by a collaborative system involving both LLMs and human researchers. Our results demonstrate that GPT-4 can effectively generate pertinent questions and grasp their nuances, particularly when given appropriate context and instructions. The study suggests that incorporating human-like questioning into AI models improves their ability to manage uncertainties, paving the way for future advancements in Artificial Intelligence (AI).
Ruffle&Riley: Towards the Automated Induction of Conversational Tutoring Systems
Schmucker, Robin, Xia, Meng, Azaria, Amos, Mitchell, Tom
Conversational tutoring systems (CTSs) offer learning experiences driven by natural language interaction. They are known to promote high levels of cognitive engagement and benefit learning outcomes, particularly in reasoning tasks. Nonetheless, the time and cost required to author CTS content is a major obstacle to widespread adoption. In this paper, we introduce a novel type of CTS that leverages the recent advances in large language models (LLMs) in two ways: First, the system induces a tutoring script automatically from a lesson text. Second, the system automates the script orchestration via two LLM-based agents (Ruffle&Riley) with the roles of a student and a professor in a learning-by-teaching format. The system allows a free-form conversation that follows the ITS-typical inner and outer loop structure. In an initial between-subject online user study (N = 100) comparing Ruffle&Riley to simpler QA chatbots and reading activity, we found no significant differences in post-test scores. Nonetheless, in the learning experience survey, Ruffle&Riley users expressed higher ratings of understanding and remembering and further perceived the offered support as more helpful and the conversation as coherent. Our study provides insights for a new generation of scalable CTS technologies.
Evaluation of mathematical questioning strategies using data collected through weak supervision
Datta, Debajyoti, Phillips, Maria, Bywater, James P, Chiu, Jennifer, Watson, Ginger S., Barnes, Laura E., Brown, Donald E
A large body of research demonstrates how teachers' questioning strategies can improve student learning outcomes. However, developing new scenarios is challenging because of the lack of training data for a specific scenario and the costs associated with labeling. This paper presents a high-fidelity, AI-based classroom simulator to help teachers rehearse research-based mathematical questioning skills. Using a human-in-the-loop approach, we collected a high-quality training dataset for a mathematical questioning scenario. Using recent advances in uncertainty quantification, we evaluated our conversational agent for usability and analyzed the practicality of incorporating a human-in-the-loop approach for data collection and system evaluation for a mathematical questioning scenario.
Automated Personalized Feedback Improves Learning Gains in an Intelligent Tutoring System
Kochmar, Ekaterina, Vu, Dung Do, Belfer, Robert, Gupta, Varun, Serban, Iulian Vlad, Pineau, Joelle
We investigate how automated, data-driven, personalized feedback in a large-scale intelligent tutoring system (ITS) improves student learning outcomes. We propose a machine learning approach to generate personalized feedback, which takes individual needs of students into account. We utilize state-of-the-art machine learning and natural language processing techniques to provide the students with personalized hints, Wikipedia-based explanations, and mathematical hints. Our model is used in Korbit, a large-scale dialogue-based ITS with thousands of students launched in 2019, and we demonstrate that the personalized feedback leads to considerable improvement in student learning outcomes and in the subjective evaluation of the feedback.
Recent Advances in Conversational Intelligent Tutoring Systems
We highlight progress in terms of macro- and microadaptivity. Macroadaptivity refers to a system's capability to select appropriate instructional tasks for the learner to work on. Microadaptivity refers to a system's capability to adapt its scaffolding while the learner is working on a particular task. The advances in macro- and microadaptivity that are presented here were made possible by the use of learning progressions, deeper dialogue, and natural language processing techniques, and by the use of affect-enabled components. Learning progressions and deeper dialogue and natural language processing techniques are key features of DeepTutor, the first intelligent tutoring system based on learning progressions.
How 'Intelligent' Tutors Could Transform Teaching
Schools may be critiqued as "factories," but robots aren't going to replace human teachers any time soon. Still, that doesn't mean that artificially intelligent systems won't transform education just as they are changing a variety of fields and practices, from the way oncologists diagnose cancer to how lawyers analyze cases. Intelligent-tutoring systems like ALEKS (for Assessment and LEarning in Knowledge Spaces), Cognitive Tutor, and a new program in development by IBM's Watson initiative are starting to expand in K-12 education, and experts argue that teachers need new training not only to use intelligent systems in the classroom but also to prepare students for careers in increasingly technology-integrated fields. "Any skill that a computer can teach is going to be done by a computer in the workplace, and that's something people don't think about enough," said Christopher Dede, an education and technology professor at the Harvard Graduate School of Education. For that reason, he said, teachers can use computer programs not simply to replace pieces of their instruction, but to model for students how to work with technology professionally.
DeepTutor: An Effective, Online Intelligent Tutoring System That Promotes Deep Learning
Rus, Vasile (The University of Memphis) | Niraula, Nobal (The University of Memphis) | Banjade, Rajendra (The University of Memphis)
We present in this paper an innovative solution to the challenge of building effective educational technologies that offer tailored instruction to each individual learner. The proposed solution, a conversational intelligent tutoring system called DeepTutor, has been developed as a web application that is accessible 24/7 through a browser from any device connected to the Internet. The success of several large-scale experiments with high-school students using DeepTutor is solid proof that conversational intelligent tutoring at scale over the web is possible.
Recent Advances in Conversational Intelligent Tutoring Systems
Rus, Vasile (The University of Memphis) | D’Mello, Sidney (University of Notre Dame) | Hu, Xiangen (The University of Memphis) | Graesser, Arthur (The University of Memphis)
We report recent advances in intelligent tutoring systems with conversational dialogue. We highlight progress in terms of macro- and microadaptivity. Macroadaptivity refers to a system’s capability to select appropriate instructional tasks for the learner to work on. Microadaptivity refers to a system’s capability to adapt its scaffolding while the learner is working on a particular task. The advances in macro- and microadaptivity that are presented here were made possible by the use of learning progressions, deeper dialogue and natural language processing techniques, and by the use of affect-enabled components. Learning progressions and deeper dialogue and natural language processing techniques are key features of DeepTutor, the first intelligent tutoring system based on learning progressions. These improvements extend the bandwidth of possibilities for tailoring instruction to each individual student, which is needed for maximizing engagement and ultimately learning.
Malleability of Students’ Perceptions of an Affect-Sensitive Tutor and Its Influence on Learning
D'Mello, Sidney (University of Notre Dame) | Graesser, Art (University of Memphis)
We evaluated an affect-sensitive version of AutoTutor, a dialogue-based ITS that simulates human tutors. While the original AutoTutor is sensitive to students’ cognitive states, the affect-sensitive tutor (Supportive tutor) also responds to students’ affective states (boredom, confusion, and frustration) with empathetic, encouraging, and motivational dialogue moves that are accompanied by appropriate emotional expressions. We conducted an experiment that compared the Supportive and Regular (non-affective) tutors over two 30-minute learning sessions with respect to perceived effectiveness, fidelity of cognitive and emotional feedback, engagement, and enjoyment. The results indicated that, irrespective of tutor, students’ ratings of engagement, enjoyment, and perceived learning decreased across sessions, but these ratings were not correlated with actual learning gains. In contrast, students’ perceptions of how closely the computer tutors resembled human tutors increased across learning sessions, were related to the quality of tutor feedback, and were a powerful predictor of learning; the increase was greater for the Supportive tutor. Implications of our findings for the design of affect-sensitive ITSs are discussed.