Neural Attribution for Semantic Bug-Localization in Student Programs

Neural Information Processing Systems

Providing feedback is an integral part of teaching. Most open online courses on programming make use of automated grading systems to support programming assignments and give real-time feedback. These systems usually rely on test results to quantify the programs' functional correctness. They return failing tests to the students as feedback. However, students may find it difficult to debug their programs if they receive no hints about where the bug is and how to fix it. In this work, we present NeuralBugLocator, a deep-learning-based technique that can localize the bugs in a faulty program with respect to a failing test, without even running the program. At the heart of our technique is a novel tree convolutional neural network which is trained to predict whether a program passes or fails a given test. To localize the bugs, we analyze the trained network using a state-of-the-art neural prediction attribution technique and see which lines of the programs make it predict the test outcomes. Our experiments show that NeuralBugLocator is generally more accurate than two state-of-the-art program-spectrum-based bug-localization baselines and one syntactic-difference-based baseline.
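The final localization step described in this abstract (turning attribution scores into a list of suspicious lines) can be illustrated with a minimal sketch. It assumes, hypothetically, that an attribution method has already produced one score per program token and that each token is tagged with its source line; the function name `rank_lines`, the per-line summation, and the example scores are illustrative assumptions, not the paper's implementation.

```python
from collections import defaultdict


def rank_lines(token_lines, token_scores, top_k=3):
    """Sum per-token attribution scores by source line and return the
    top_k line numbers, most suspicious first.

    Hypothetical helper: summing by line is an assumption of this sketch.
    """
    if len(token_lines) != len(token_scores):
        raise ValueError("token_lines and token_scores must have equal length")

    per_line = defaultdict(float)
    for line, score in zip(token_lines, token_scores):
        per_line[line] += score

    # Sort by descending score; break ties by ascending line number.
    ranked = sorted(per_line.items(), key=lambda kv: (-kv[1], kv[0]))
    return [line for line, _ in ranked[:top_k]]


# Made-up scores for a three-line program (not real model output).
token_lines = [1, 1, 2, 2, 2, 3]
token_scores = [0.05, 0.10, 0.40, 0.30, 0.05, 0.10]
suspects = rank_lines(token_lines, token_scores, top_k=2)
print(suspects)
```

In this toy input, line 2 accumulates the largest total score, so it is reported first.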




Giving Feedback on Interactive Student Programs with Meta-Exploration

Neural Information Processing Systems

One approach toward automatic grading is to learn an agent that interacts with a student's program and explores states indicative of errors via reinforcement learning. However, existing work on this approach only provides binary feedback of whether a program is correct or not, while students require finer-grained feedback on the specific errors in their programs to understand their mistakes. In this work, we show that exploring to discover errors can be cast as a meta-exploration problem.


Reviews: Neural Attribution for Semantic Bug-Localization in Student Programs

Neural Information Processing Systems

The paper received excellent reviews and strong acceptance recommendations. Although the targeted application is interesting, I personally found the technical content thin and the experimental section close to the average standard for machine learning applications. I therefore recommend acceptance as a poster.



Next-Step Hint Generation for Introductory Programming Using Large Language Models

Roest, Lianne, Keuning, Hieke, Jeuring, Johan

arXiv.org Artificial Intelligence

Large Language Models possess skills such as answering questions, writing essays or solving programming exercises. Since these models are easily accessible, researchers have investigated their capabilities and risks for programming education. This work explores how LLMs can contribute to programming education by supporting students with automated next-step hints. We investigate prompt practices that lead to effective next-step hints and use these insights to build our StAP-tutor. We evaluate this tutor by conducting an experiment with students, and performing expert assessments. Our findings show that most LLM-generated feedback messages describe one specific next step and are personalised to the student's code and approach. However, the hints may contain misleading information and lack sufficient detail when students approach the end of the assignment. This work demonstrates the potential for LLM-generated feedback, but further research is required to explore its practical implementation.


Giving Feedback on Interactive Student Programs with Meta-Exploration

Liu, Evan Zheran, Stephan, Moritz, Nie, Allen, Piech, Chris, Brunskill, Emma, Finn, Chelsea

arXiv.org Artificial Intelligence

Developing interactive software, such as websites or games, is a particularly engaging way to learn computer science. However, teaching and giving feedback on such software is time-consuming -- standard approaches require instructors to manually grade student-implemented interactive programs. As a result, online platforms that serve millions, like Code.org, are unable to provide any feedback on assignments for implementing interactive programs, which critically hinders students' ability to learn. One approach toward automatic grading is to learn an agent that interacts with a student's program and explores states indicative of errors via reinforcement learning. However, existing work on this approach only provides binary feedback of whether a program is correct or not, while students require finer-grained feedback on the specific errors in their programs to understand their mistakes. In this work, we show that exploring to discover errors can be cast as a meta-exploration problem. This enables us to construct a principled objective for discovering errors and an algorithm for optimizing this objective, which provides fine-grained feedback. We evaluate our approach on a set of over 700K real anonymized student programs from a Code.org interactive assignment. Our approach provides feedback with 94.3% accuracy, improving over existing approaches by 17.7% and coming within 1.5% of human-level accuracy. Project web page: https://ezliu.github.io/dreamgrader.


Prutor

Communications of the ACM

Programming education in India faces an uphill task of educating two-plus million students who enroll in degree programs with coding as a core skill.[10] Fulfilling this demand presents unique challenges due to inadequate infrastructure and the unavailability of technical content in regional languages,[6] with unfortunate outcomes: more than 90% of Indian graduates have coding skills inadequate for IT roles, and more than 37% struggle to write code that even compiles.[13] Studies indicate a steep decline in coding skills between graduates from top-100 colleges and the rest.[12] While concerning, since only a tiny minority of students enroll in top-tier colleges, this is not entirely surprising. A recent study indicates that even experienced instructors at non-top-tier colleges in India struggle to write code.[10]


Repairing Bugs in Python Assignments Using Large Language Models

Zhang, Jialu, Cambronero, José, Gulwani, Sumit, Le, Vu, Piskac, Ruzica, Soares, Gustavo, Verbruggen, Gust

arXiv.org Artificial Intelligence

Students often make mistakes on their introductory programming assignments as part of their learning process. Unfortunately, providing custom repairs for these mistakes can require a substantial amount of time and effort from class instructors. Automated program repair (APR) techniques can be used to synthesize such fixes. Prior work has explored the use of symbolic and neural techniques for APR in the education domain. Both types of approaches require either substantial engineering efforts or large amounts of data and training. We propose to use a large language model trained on code, such as Codex, to build an APR system -- MMAPR -- for introductory Python programming assignments. Our system can fix both syntactic and semantic mistakes by combining multi-modal prompts, iterative querying, test-case-based selection of few-shots, and program chunking. We evaluate MMAPR on 286 real student programs and compare to a baseline built by combining a state-of-the-art Python syntax repair engine, BIFI, and state-of-the-art Python semantic repair engine for student assignments, Refactory. We find that MMAPR can fix more programs and produce smaller patches on average.
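The general idea of validating candidate repairs against test cases and preferring smaller patches can be sketched as follows. This is a hedged illustration only: the candidates are hard-coded stand-ins for model outputs, and the helper names (`passes_tests`, `patch_size`, `select_repair`) and the line-diff size measure are assumptions of this sketch, not MMAPR's actual pipeline.

```python
import difflib


def passes_tests(source, tests, func_name):
    """Run a candidate program and check it against (args, expected) pairs.

    Assumption: candidates define a function called func_name. A real
    system would execute untrusted code in a sandbox, not with exec().
    """
    namespace = {}
    try:
        exec(source, namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in tests)
    except Exception:
        return False


def patch_size(original, candidate):
    """Count added and removed lines between two programs."""
    diff = difflib.ndiff(original.splitlines(), candidate.splitlines())
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


def select_repair(original, candidates, tests, func_name):
    """Return the test-passing candidate with the smallest patch, or None."""
    valid = [c for c in candidates if passes_tests(c, tests, func_name)]
    if not valid:
        return None
    return min(valid, key=lambda c: patch_size(original, c))


# Hypothetical buggy submission and hand-written candidate repairs.
buggy = "def add(a, b):\n    return a - b\n"
candidates = [
    "def add(a, b):\n    return a + b\n",
    "def add(x, y):\n    total = x + y\n    return total\n",
    "def add(a, b):\n    return a * b\n",
]
tests = [((1, 2), 3), ((0, 5), 5)]
best = select_repair(buggy, candidates, tests, "add")
print(best)
```

Here the first two candidates pass both tests, and the first is selected because it changes fewer lines of the student's program.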