Rus, Vasile


Can LLMs Identify Gaps and Misconceptions in Students' Code Explanations?

arXiv.org Artificial Intelligence

This paper investigates various approaches using Large Language Models (LLMs) to identify gaps and misconceptions in students' self-explanations of specific instructional material, in our case explanations of code examples. This research is part of our larger effort to automate the assessment of students' freely generated responses, focusing specifically on their self-explanations of code examples during activities related to code comprehension. In this work, we experiment with zero-shot prompting, Supervised Fine-Tuning (SFT), and preference alignment of LLMs to identify gaps in students' self-explanations. With simple prompting, GPT-4 consistently outperformed LLaMA3 and Mistral in identifying gaps and misconceptions, as confirmed by human evaluations. Additionally, our results suggest that fine-tuned large language models are more effective at identifying gaps in students' explanations than zero-shot and few-shot prompting techniques. Furthermore, our findings show that the preference optimization approach using Odds Ratio Preference Optimization (ORPO) outperforms SFT in identifying gaps and misconceptions in students' code explanations.
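The zero-shot setup described above amounts to assembling a single instruction-style prompt per student response. The sketch below is a hypothetical prompt template for illustration only; the paper's actual prompt wording is not reproduced, and the model call itself is omitted:

```python
def build_gap_prompt(code_snippet: str, student_explanation: str) -> str:
    """Assemble a zero-shot prompt asking an LLM to flag gaps and
    misconceptions in a student's explanation of a code example.
    (Hypothetical template, not the paper's exact wording.)"""
    return (
        "You are a programming tutor. Compare the student's explanation "
        "to the code below and list any gaps or misconceptions.\n\n"
        f"Code:\n{code_snippet}\n\n"
        f"Student explanation:\n{student_explanation}\n\n"
        "Gaps and misconceptions:"
    )

# Example: the student has an off-by-one misconception about range().
prompt = build_gap_prompt(
    "for i in range(3):\n    print(i)",
    "This loop prints the numbers 1 to 3.",
)
```

The resulting string would then be sent to GPT-4, LLaMA3, or Mistral via the respective API; SFT and ORPO instead adapt the model weights on labeled (explanation, gap-annotation) pairs.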


Mastery Guided Non-parametric Clustering to Scale-up Strategy Prediction

arXiv.org Artificial Intelligence

Predicting the strategy (sequence of concepts) that a student is likely to use in problem-solving helps Adaptive Instructional Systems (AISs) better adapt themselves to different types of learners based on their learning abilities. This can lead to a more dynamic, engaging, and personalized experience for students. To scale up training of prediction models (such as LSTMs) over large-scale education datasets, we develop a non-parametric approach to cluster symmetric instances in the data. Specifically, we learn a representation based on Node2Vec that encodes symmetries over mastery or skill level since, to solve a problem, a student's strategy is likely to involve concepts in which they have gained mastery. Using this representation, we apply DP-Means to group symmetric instances through a coarse-to-fine refinement of the clusters. We apply our model to learn strategies for Math learning from large-scale datasets from MATHia, a leading AIS for middle-school math learning. Our results illustrate that our approach can consistently achieve high accuracy using a small sample that is representative of the full dataset. Further, we show that this approach helps us learn strategies with high accuracy for students at different skill levels, i.e., leveraging symmetries improves fairness in the prediction model.
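The clustering step is built on DP-Means, which unlike k-means does not fix the number of clusters in advance: a point farther than a penalty radius from every existing center opens a new cluster. A minimal illustrative version (not the paper's coarse-to-fine variant over Node2Vec embeddings) looks like this:

```python
import numpy as np

def dp_means(X, lam, n_iter=10):
    """Minimal DP-Means sketch. `lam` is the penalty radius: a point
    farther than `lam` from every center spawns a new cluster."""
    centers = [X.mean(axis=0)]          # start with one global cluster
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assignment pass, possibly opening new clusters.
        for i, x in enumerate(X):
            dists = np.linalg.norm(np.array(centers) - x, axis=1)
            if dists.min() > lam:
                centers.append(x.copy())
                labels[i] = len(centers) - 1
            else:
                labels[i] = int(dists.argmin())
        # Update pass: move each non-empty center to its members' mean.
        centers = [X[labels == k].mean(axis=0) if np.any(labels == k) else c
                   for k, c in enumerate(centers)]
    return np.array(centers), labels
```

In the paper's setting, `X` would hold mastery-aware Node2Vec embeddings of student–problem instances, and the resulting clusters are sampled to train the LSTM strategy predictor.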


Automated Assessment of Students' Code Comprehension using LLMs

arXiv.org Artificial Intelligence

Assessing students' answers, and in particular natural language answers, is a crucial challenge in the field of education. Advances in machine learning, including transformer-based models such as Large Language Models (LLMs), have led to significant progress on various natural language tasks. Nevertheless, amidst the growing trend of evaluating LLMs across diverse tasks, evaluating LLMs for automated answer assessment has not received much attention. To address this gap, we explore the potential of using LLMs for automated assessment of students' short and open-ended answers. In particular, we use LLMs to compare students' explanations with expert explanations in the context of line-by-line explanations of computer programs. For comparison purposes, we assess both LLMs and encoder-based Semantic Textual Similarity (STS) models on the task of judging the correctness of students' explanations of computer code. Our findings indicate that LLMs, when prompted in few-shot and chain-of-thought settings, perform comparably to fine-tuned encoder-based models in evaluating students' short answers in the programming domain.
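The encoder-based STS baseline reduces to comparing embedding vectors of the student and expert explanations. A minimal sketch, with toy vectors standing in for encoder outputs and an assumed similarity threshold:

```python
import numpy as np

def cosine_similarity(a, b) -> float:
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def assess(student_vec, expert_vec, threshold=0.8) -> bool:
    """Mark a student explanation correct if its embedding is close
    enough to the expert explanation's embedding.
    The 0.8 threshold is an illustrative assumption."""
    return cosine_similarity(student_vec, expert_vec) >= threshold
```

In practice the vectors would come from a sentence encoder applied to each line-by-line explanation pair; the LLM alternative replaces this pipeline with a few-shot or chain-of-thought prompt asking the model to judge correctness directly.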


The Behavior of Large Language Models When Prompted to Generate Code Explanations

arXiv.org Artificial Intelligence

This paper systematically investigates the generation of code explanations by Large Language Models (LLMs) for code examples commonly encountered in introductory programming courses. Our findings reveal significant variations in the nature of code explanations produced by LLMs, influenced by factors such as the wording of the prompt, the specific code examples under consideration, the programming language involved, the temperature parameter, and the version of the LLM. However, a consistent pattern emerges for Java and Python, where explanations exhibit a Flesch-Kincaid readability level of approximately grade 7-8 and a consistent lexical density, i.e., the proportion of meaningful words relative to the total explanation size. Additionally, the generated explanations consistently achieve high scores for correctness, but lower scores on three other metrics: completeness, conciseness, and specificity.
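The two text measures above are straightforward to compute. The sketch below uses the standard Flesch-Kincaid grade formula with a crude vowel-group syllable counter, and approximates lexical density with a tiny stopword list; real analyses would use proper NLP tooling:

```python
import re

# Illustrative stopword set; a real analysis would use a full function-word list.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "that", "it"}

def count_syllables(word: str) -> int:
    # Crude heuristic: count runs of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    """0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59"""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59

def lexical_density(text: str) -> float:
    """Fraction of words that are content (non-stopword) words."""
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    content = [w for w in words if w not in STOPWORDS]
    return len(content) / len(words)
```

Applied to each generated explanation, these yield the grade-level and density statistics whose stability across prompts and models the paper reports.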


Scalable and Equitable Math Problem Solving Strategy Prediction in Big Educational Data

arXiv.org Artificial Intelligence

Understanding a student's problem-solving strategy can have a significant impact on effective math learning using Intelligent Tutoring Systems (ITSs) and Adaptive Instructional Systems (AISs). For instance, the ITS/AIS can better personalize itself to correct specific misconceptions that are indicated by incorrect strategies, specific problems can be designed to improve strategies, and frustration can be minimized by adapting to a student's natural way of thinking rather than trying to fit a standard strategy to all. While it may be possible for human experts to identify strategies manually in classroom settings with sufficient student interaction, it is not possible to scale this up to big data. Therefore, we leverage advances in Machine Learning and AI methods to perform scalable strategy prediction that is also fair to students at all skill levels. Specifically, we develop an embedding called MVec where we learn a representation based on the mastery of students. We then cluster these embeddings with a non-parametric clustering method where we progressively learn clusters such that we group together instances that have approximately symmetrical strategies. The strategy prediction model is trained on instances sampled from these clusters. This ensures that we train the model over diverse strategies and also that strategies from a particular group do not bias the DNN model, thus allowing it to optimize its parameters over all groups. Using real-world large-scale student interaction datasets from MATHia, we implement our approach using transformers and Node2Vec for learning the mastery embeddings and LSTMs for predicting strategies. We show that our approach can scale up to achieve high accuracy by training on a small sample of a large dataset and also has predictive equality, i.e., it can predict strategies equally well for learners at diverse skill levels.
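Predictive equality, as used above, amounts to checking that prediction accuracy is roughly equal across skill-level groups. A minimal sketch of that evaluation (group labels and data are illustrative):

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Per-group accuracy; near-equal values across skill-level groups
    indicate predictive equality."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        totals[g] += 1
        hits[g] += int(t == p)
    return {g: hits[g] / totals[g] for g in totals}

# Hypothetical strategy predictions for low- and high-mastery learners.
acc = accuracy_by_group(
    y_true=[1, 0, 1, 1],
    y_pred=[1, 0, 0, 1],
    groups=["low", "low", "high", "high"],
)
```

In the paper's setup the groups would be derived from mastery levels estimated for the MVec embeddings, and the predictions come from the LSTM strategy model.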


Attention Based Transformer for Student Answers Assessment

AAAI Conferences

Inspired by Vaswani’s transformer, we propose in this paper an attention-based transformer neural network with a multi-head attention mechanism for the task of student answer assessment. Results show the competitiveness of our proposed model. The highest accuracy, 71.5%, was achieved when using ELMo embeddings, 10 attention heads, and 2 layers. This is very competitive and rivals the highest accuracy achieved by a previously proposed BI-GRU-Capsnet deep network (72.5%) on the same dataset. The main advantages of transformers over BI-GRU-Capsnet are reduced training time and greater scope for parallelization.
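Each head in the multi-head mechanism computes the standard scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. A single-head numpy sketch of that core operation (not the paper's full architecture):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Return softmax(Q K^T / sqrt(d_k)) V and the attention weights."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights
```

Multi-head attention runs several such heads over learned projections of the input (here, ELMo embeddings of the answer tokens) and concatenates their outputs.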


Experiments with a Socratic Intelligent Tutoring System for Source Code Understanding

AAAI Conferences

Computer Science (CS) education is critical in today's world, yet introductory programming courses are considered extremely difficult and frustrating, often a major stumbling block for students wishing to pursue computer-programming-related careers. In this paper, we describe the design of Socratic Tutor, an Intelligent Tutoring System that can help novice programmers better understand programming concepts. The system was inspired by the Socratic method of teaching, in which the main goal is to ask a set of guiding questions about key concepts and major steps or segments of complete code examples. To evaluate the Socratic Tutor, we conducted a pilot study with 34 computer science students, and the results are promising in terms of learning gains.


A Conversational Intelligent Agent for Career Guidance and Counseling

AAAI Conferences

Navigating a career constitutes one of life’s most enduring challenges, particularly within a unique organization like the US Navy. While the Navy has numerous resources for guidance, accessing and identifying key information sources across the many existing platforms can be challenging for sailors (e.g., determining the appropriate program or point of contact, developing an accurate understanding of the process, and even recognizing the need for planning itself). Focusing on intermediate goals, evaluations, education, certifications, and training is quite demanding, even before considering their cumulative long-term implications. These are on top of generic personal issues, such as financial difficulties and homesickness when at sea for prolonged periods. We present the preliminary construction of a conversational intelligent agent designed to provide a user-friendly, adaptive environment that recognizes user input pertinent to these issues and provides guidance to appropriate resources within the Navy. User input from “counseling sessions” is linked, using advanced natural language processing techniques, to our framework of Navy training and education standards, promotion protocols, and organizational structure, producing feedback on resources and recommendations sensitive to user history and stated career goals. The proposed innovative technology monitors sailors’ career progress, proactively triggering sessions before major career milestones or when performance drops below Navy expectations, by using a mixed-initiative design. System-triggered sessions involve positive feedback and informative dialogues (using existing Navy career guidance protocols). The intelligent agent also offers counseling for personal problems, triggering targeted dialogues designed to gather more information, offer tailored suggestions, and provide referrals to appropriate resources or to a human counselor when in-depth counseling is warranted. 
This software, currently in alpha testing, has the potential to serve as a centralized information hub, engaging and encouraging sailors to take ownership of their career paths in the most efficient way possible, benefiting both individuals and the Navy as a whole.


Report on the Thirty-First International Florida Artificial Intelligence Research Society Conference (FLAIRS-31)

AI Magazine

The Thirty-First International Florida Artificial Intelligence Research Society Conference (FLAIRS-31) was held May 21-23, 2018, at the Crowne Plaza Oceanfront in Melbourne, Florida, USA. The conference events included invited speakers, special tracks, and presentations of papers, posters, and awards. The conference chair was Zdravko Markov from Central Connecticut State University. The program co-chairs were Vasile Rus from the University of Memphis and Keith Brawner from the Army Research Laboratory. The special tracks were coordinated by Roman Barták from Charles University in Prague.