novice programmer
Evaluating the Quality of Code Comments Generated by Large Language Models for Novice Programmers
Fan, Aysa Xuemo, Narayanan, Arun Balajiee Lekshmi, Hassany, Mohammad, Ke, Jiaze
Large Language Models (LLMs) show promise in generating code comments for novice programmers, but their educational effectiveness remains under-evaluated. This study assesses the instructional quality of code comments produced by GPT-4, GPT-3.5-Turbo, and Llama2, compared to expert-developed comments, focusing on their suitability for novices. Analyzing a dataset of "easy" level Java solutions from LeetCode, we find that GPT-4 exhibits comparable quality to expert comments in aspects critical for beginners, such as clarity, beginner-friendliness, concept elucidation, and step-by-step guidance. GPT-4 outperforms Llama2 in discussing complexity (chi-square = 11.40, p = 0.001) and is perceived as significantly more supportive for beginners than GPT-3.5 and Llama2 (Mann-Whitney U = 300.5 and 322.5, p = 0.0017 and 0.0003, respectively). This study highlights the potential of LLMs for generating code comments tailored to novice programmers.
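For readers unfamiliar with the Mann-Whitney U statistic reported above, the following is a minimal sketch of how U is obtained from two groups of ordinal ratings. The function name `mann_whitney_u` and the rating lists are hypothetical and invented purely for illustration; they are not data or code from the study, and the sketch does not compute the p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) by direct pairwise comparison.

    U_a counts the pairs (x, y), x from sample_a and y from sample_b,
    in which x is larger; a tie counts as half a pair.
    """
    u_a = 0.0
    for x in sample_a:
        for y in sample_b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    # The two statistics always sum to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical 1-5 ratings, invented for illustration (not from the study).
ratings_model_a = [5, 4, 4, 3]
ratings_model_b = [3, 2, 4, 1]

u_a, u_b = mann_whitney_u(ratings_model_a, ratings_model_b)
print(u_a, u_b)  # 13.5 2.5
```

A U value close to the total number of pairs (here 16) means one group's ratings are almost always higher than the other's; a value near half that total suggests no systematic difference.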
Leveraging Language Models and Automatic Summarization in Online Programming Learning Environments
Objective B. Messages in the forum are summarized using automatic natural language techniques. These summaries are intended to help students identify the errors they are having and improve their ability to ask for help. When they use the selected recommendations, students will be provided with strategies to learn how to ask better questions iteratively. "Improving question formulation" represents acquiring a deeper understanding of the topic and emerges as a key strategy for advancing programming learning. The automatic summarization was implemented by adapting a technique known as TextRank to the domain of the Mumuki forum in Spanish.
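The excerpt does not describe how TextRank was adapted to the Mumuki forum, so the following is only a generic sketch of TextRank-style extractive summarization: sentences become graph nodes, word overlap gives edge weights, and a PageRank-like iteration scores each sentence. The function names, the naive sentence splitter, the overlap measure over unique words, and the sample message are all assumptions made for illustration.

```python
import math
import re


def split_sentences(text):
    """Naive splitter on '.', '!' and '?' (an assumption; real forum text needs more care)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def similarity(words_a, words_b):
    """Shared unique words, normalized by the log of each sentence's vocabulary size."""
    set_a, set_b = set(words_a), set(words_b)
    if len(set_a) < 2 or len(set_b) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(set_a & set_b) / (math.log(len(set_a)) + math.log(len(set_b)))


def textrank_summary(text, top_k=2, damping=0.85, iterations=50):
    """Return the top_k highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    tokens = [re.findall(r"\w+", s.lower()) for s in sentences]
    n = len(sentences)
    # Weighted, undirected sentence graph stored as an adjacency matrix.
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]


# Hypothetical forum message, invented for illustration.
message = (
    "My loop never ends when I run the program. "
    "The loop condition uses the wrong variable. "
    "I think the variable in the loop condition never changes. "
    "Thanks!"
)
print(textrank_summary(message, top_k=2))
```

Sentences that share vocabulary with many others accumulate score, while an isolated sentence such as a closing "Thanks!" keeps only the baseline score and is left out of the summary.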
"It's Weird That it Knows What I Want": Usability and Interactions with Copilot for Novice Programmers
Prather, James, Reeves, Brent N., Denny, Paul, Becker, Brett A., Leinonen, Juho, Luxton-Reilly, Andrew, Powell, Garrett, Finnie-Ansley, James, Santos, Eddie Antonio
Recent developments in deep learning have resulted in code-generation models that produce source code from natural language and code-based prompts with high accuracy. This is likely to have profound effects in the classroom, where novices learning to code can now use free tools to automatically suggest solutions to programming exercises and assignments. However, little is currently known about how novices interact with these tools in practice. We present the first study that observes students at the introductory level using one such code auto-generating tool, GitHub Copilot, on a typical introductory programming (CS1) assignment. Through observations and interviews, we explore student perceptions of the benefits and pitfalls of this technology for learning, present new observed interaction patterns, and discuss cognitive and metacognitive difficulties faced by students. We consider design implications of these findings, specifically in terms of how tools like Copilot can better support and scaffold the novice programming experience.
Teaching UML Skills to Novice Programmers Using a Sample Solution Based Intelligent Tutoring System
Schramm, Joachim (Clausthal University of Technology) | Strickroth, Sven (Clausthal University of Technology) | Le, Nguyen-Thinh (Clausthal University of Technology) | Pinkwart, Niels (Clausthal University of Technology)
Modeling skills are essential during the process of learning programming. Intelligent tutoring systems (ITSs) for modeling are typically hard to build due to the ill-definedness of most modeling tasks. This paper presents a system that can teach UML skills to novice programmers. The system is “simple and cheap” in the sense that it only requires an expert solution against which the student solutions are compared, but is still flexible enough to accommodate the degree of solution variability that is characteristic of modeling tasks. An empirical evaluation via a controlled lab study showed that the system worked fine and, while not leading to significant learning gains as compared to a control condition, still revealed some promising results.