ai submission
University examiners fail to spot ChatGPT answers in real-world test
Ninety-four per cent of university exam submissions created using ChatGPT weren't detected as being generated by artificial intelligence, and these submissions tended to get higher scores than real students' work. Peter Scarfe at the University of Reading, UK, and his colleagues used ChatGPT to produce answers to 63 assessment questions on five modules across the university's psychology undergraduate degrees. Students sat these exams at home, so they were allowed to look at notes and references, and they could potentially have used AI although this wasn't permitted. The AI-generated answers were submitted alongside real students' work, and accounted for, on average, 5 per cent of the total scripts marked by academics. The markers weren't informed that they were checking the work of 33 fake students – whose names were themselves generated by ChatGPT.
Student Mastery or AI Deception? Analyzing ChatGPT's Assessment Proficiency and Evaluating Detection Strategies
Wang, Kevin, Akins, Seth, Mohammed, Abdallah, Lawrence, Ramon
Generative AI systems such as ChatGPT have a disruptive effect on learning and assessment. Computer science requires practice to develop skills in problem solving and programming that are traditionally developed using assignments. Generative AI has the capability of completing these assignments for students with high accuracy, which dramatically increases the potential for academic integrity issues and students not achieving desired learning outcomes. This work investigates the performance of ChatGPT by evaluating it across three courses (CS1, CS2, databases). ChatGPT completes almost all introductory assessments perfectly. Existing detection methods, such as MOSS and JPlag (based on similarity metrics) and GPTzero (AI detection), have mixed success in identifying AI solutions. Evaluating instructors and teaching assistants using heuristics to distinguish between student and AI code shows that their detection is not sufficiently accurate. These observations emphasize the need for adapting assessments and improved detection methods.
The Economist's essay contest featured an AI submission. Here's what the judges thought.
Earlier this summer, the Economist announced a competition for young people. They asked contestants to answer this question: "What fundamental economic and political change, if any, is needed for an effective response to climate change?" More than 2,400 people responded, from over 110 countries. And the Economist slipped one essay into the stack of submissions that their judges would review: an essay written by an artificial intelligence. The AI in question was GPT-2, a language-generating system developed by San Francisco AI lab OpenAI and announced this spring.