Evaluating AI-Generated Essays with GRE Analytical Writing Assessment

Zhong, Yang, Hao, Jiangang, Fauss, Michael, Li, Chen, Wang, Yuan

arXiv.org Artificial Intelligence

The recent revolutionary advance in generative AI enables the generation of realistic and coherent texts by large language models (LLMs). Despite many existing evaluation metrics for the quality of generated texts, there is still a lack of rigorous assessment of how well LLMs perform in complex and demanding writing assessments. This study examines essays generated by ten leading LLMs for the analytical writing assessment of the Graduate Record Exam (GRE). We assessed these essays using both human raters and the e-rater automated scoring engine as used in the GRE scoring pipeline. Notably, the top-performing Gemini and GPT-4o received average scores of 4.78 and 4.67, respectively, falling between "generally thoughtful, well-developed analysis of the issue and conveys meaning clearly" and "presents a competent analysis of the issue and conveys meaning with acceptable clarity" according to the GRE scoring guideline. We also evaluated the detection accuracy of these essays, with detectors trained on essays generated by the same and different LLMs.


Decoding AI and Human Authorship: Nuances Revealed Through NLP and Statistical Analysis

Akinwande, Mayowa, Adeliyi, Oluwaseyi, Yussuph, Toyyibat

arXiv.org Artificial Intelligence

This research explores the nuanced differences between texts produced by AI and those written by humans, aiming to elucidate how language is expressed differently by each. Through comprehensive statistical data analysis, the study investigates various linguistic traits, patterns of creativity, and potential biases inherent in human-written and AI-generated texts. The significance of this research lies in its contribution to understanding AI's creative capabilities and its impact on literature, communication, and societal frameworks. By examining a meticulously curated dataset comprising 500K essays spanning diverse topics and genres, generated by LLMs or written by humans, the study uncovers the deeper layers of linguistic expression and provides insights into the cognitive processes underlying both AI- and human-driven textual compositions. The analysis revealed that human-authored essays tend to have a higher total word count on average than AI-generated essays but a shorter average word length, and while both groups exhibit high levels of fluency, the vocabulary diversity of human-authored content is higher than that of AI-generated content. However, AI-generated essays show a slightly higher level of novelty, suggesting the potential for generating more original content through AI systems. The paper addresses challenges in assessing the language generation capabilities of AI models and emphasizes the importance of datasets that reflect the complexities of human-AI collaborative writing. Through systematic preprocessing and rigorous statistical analysis, this study offers valuable insights into the evolving landscape of AI-generated content and informs future developments in natural language processing (NLP).
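The metrics compared in this abstract (total word count, average word length, vocabulary diversity) can be approximated with simple whitespace tokenization. The sketch below is illustrative only and is not the paper's actual preprocessing pipeline; the type-token ratio used here is one common proxy for vocabulary diversity among several the authors might have used.

```python
def essay_metrics(text: str) -> dict:
    """Compute simple linguistic metrics of the kind compared above.

    Uses naive whitespace tokenization; a real analysis would
    normalize punctuation and use a proper tokenizer.
    """
    words = text.lower().split()
    total = len(words)
    avg_len = sum(len(w) for w in words) / total if total else 0.0
    # Type-token ratio: distinct words over total words,
    # a common proxy for vocabulary diversity.
    ttr = len(set(words)) / total if total else 0.0
    return {
        "word_count": total,
        "avg_word_length": avg_len,
        "vocab_diversity": ttr,
    }

human_sample = "the quick brown fox jumps over the lazy dog near the river"
ai_sample = "the sophisticated model generates remarkably coherent prose"
print(essay_metrics(human_sample))
print(essay_metrics(ai_sample))
```

Note that raw type-token ratio is sensitive to text length, which is why length-corrected variants are often preferred when essays differ substantially in word count, as they do here.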


Hidding the Ghostwriters: An Adversarial Evaluation of AI-Generated Student Essay Detection

Peng, Xinlin, Zhou, Ying, He, Ben, Sun, Le, Sun, Yingfei

arXiv.org Artificial Intelligence

Large language models (LLMs) have exhibited remarkable capabilities in text generation tasks. However, the utilization of these models carries inherent risks, including but not limited to plagiarism, the dissemination of fake news, and issues in educational exercises. Although several detectors have been proposed to address these concerns, their effectiveness against adversarial perturbations, specifically in the context of student essay writing, remains largely unexplored. This paper aims to bridge this gap by constructing AIG-ASAP, an AI-generated student essay dataset, employing a range of text perturbation methods that are expected to generate high-quality essays while evading detection. Through empirical experiments, we assess the performance of current AIGC detectors on the AIG-ASAP dataset. The results reveal that the existing detectors can be easily circumvented using straightforward automatic adversarial attacks. Specifically, we explore word substitution and sentence substitution perturbation methods that effectively evade detection while maintaining the quality of the generated essays. This highlights the urgent need for more accurate and robust methods to detect AI-generated student essays in the education domain.
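The word-substitution attack described in this abstract can be illustrated with a minimal sketch. The synonym table below is hypothetical and the paper's actual AIG-ASAP perturbation methods are more sophisticated (they aim to preserve essay quality while evading detectors); this only shows the basic shape of the attack.

```python
import random

# Hypothetical synonym table for illustration; the paper's method
# selects substitutions that preserve essay quality.
SYNONYMS = {
    "utilize": "use",
    "demonstrate": "show",
    "furthermore": "also",
    "significant": "notable",
}

def perturb(text: str, rate: float = 1.0, seed: int = 0) -> str:
    """Replace words found in SYNONYMS with their substitutes,
    each with probability `rate`, using a seeded RNG for
    reproducibility. Case and punctuation handling are omitted."""
    rng = random.Random(seed)
    out = []
    for word in text.split():
        key = word.lower()
        if key in SYNONYMS and rng.random() < rate:
            out.append(SYNONYMS[key])
        else:
            out.append(word)
    return " ".join(out)

print(perturb("Results demonstrate a significant effect"))
```

Even this naive substitution changes the token distribution a detector was trained on, which is the mechanism the paper exploits at much higher quality.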


Can ChatGPT get into Harvard? We tested its admissions essay.

Washington Post - Technology News

ChatGPT's release a year ago triggered a wave of panic among educators. Now, universities are in the midst of college application season, concerned that students might use the artificial intelligence tool to forge admissions essays. But is a chatbot-created essay good enough to fool college admissions counselors? To find out, The Washington Post asked a prompt engineer -- an expert at directing AI chatbots -- to create college essays using ChatGPT. The chatbot produced two essays: one responding to a question from the Common Application, which thousands of colleges use for admissions, and one answering a prompt used solely for applicants to Harvard University.


AI-generated essays are nothing to worry about (opinion)

#artificialintelligence

September 2022 was apparently the month artificial intelligence essay angst boiled over in academia, as various media outlets published opinion pieces lamenting the rise of AI writing systems that will ruin student writing and pave the way toward unprecedented levels of academic misconduct. Then, on Sept. 23, academic Twitter exploded into a bit of a panic on this topic. The firestorm was prompted by a post to the OpenAI subreddit where user Urdadgirl69 claimed to be getting straight A's with essays "written" using artificial intelligence. Professors on Reddit and Twitter alike expressed frustration and concern about how best to address the threat of AI essays. One of the most poignant and widely retweeted laments came from Redditor ahumanlikeyou, who wrote, "Grading something an AI wrote is an incredibly depressing waste of my life."