essay prompt
EssayBench: Evaluating Large Language Models in Multi-Genre Chinese Essay Writing
Gao, Fan, Li, Dongyuan, Xia, Ding, Mi, Fei, Wang, Yasheng, Shang, Lifeng, Wang, Baojun
Chinese essay writing and its evaluation are critical in educational contexts, yet the capabilities of Large Language Models (LLMs) in this domain remain largely underexplored. Existing benchmarks often rely on coarse-grained text quality metrics, largely overlooking the structural and rhetorical complexities of Chinese essays, particularly across diverse genres. To address this gap, we propose EssayBench, a multi-genre benchmark specifically designed for Chinese essay writing across four major genres: Argumentative, Narrative, Descriptive, and Expository. We curate and refine a total of 728 real-world prompts to ensure authenticity and meticulously categorize them into the Open-Ended and Constrained sets to capture diverse writing scenarios. To reliably evaluate generated essays, we develop a fine-grained, genre-specific scoring framework that hierarchically aggregates scores. We further validate our evaluation protocol through a comprehensive human agreement study. Finally, we benchmark 15 large-sized LLMs, analyzing their strengths and limitations across genres and instruction types. With EssayBench, we aim to advance LLM-based Chinese essay evaluation and inspire future research on improving essay generation in educational settings.
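The abstract says only that scores are aggregated hierarchically; it does not give the rubric or the aggregation rule. As a rough illustration of the idea, the sketch below assumes a nested rubric whose leaves are numeric scores and whose inner nodes are averaged without weights. The dimension names (`content`, `thesis`, and so on), the score scale, and the unweighted mean are all hypothetical and are not taken from EssayBench.

```python
from statistics import mean


def aggregate(node):
    """Recursively aggregate a nested rubric into one score.

    A leaf is a number (a fine-grained score); an inner node is a dict
    mapping sub-dimension names to leaves or further dicts.
    Assumption: each inner node is the unweighted mean of its children.
    """
    if isinstance(node, (int, float)):
        return float(node)
    return mean(aggregate(child) for child in node.values())


# Hypothetical rubric for a single argumentative essay.
scores = {
    "content": {"thesis": 8, "evidence": 6},
    "language": {"fluency": 9},
}

overall = aggregate(scores)
print(overall)
```

A weighted variant would only need a weight attached to each child; the recursive structure stays the same.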
I'm a teacher and this is the simple way I can tell if students have used AI to cheat in their essays
With ChatGPT and Bard both becoming more and more popular, many students are being tempted to use AI chatbots to cheat on their essays. But one teacher has come up with a clever trick dubbed the 'Trojan Horse' to catch them out. In a TikTok video, Daina Petronis, an English language teacher from Toronto, shows how she can easily spot AI essays. By putting a hidden prompt into her assignments, Ms Petronis tricks the AI into including unusual words which she can quickly find. 'Since no plagiarism detector is 100% accurate, this method is one of the few ways we can locate concrete evidence and extend our help to students who need guidance with AI,' Ms Petronis said.
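The checking step described here, looking for the unusual words that the hidden prompt asked for, could be automated along the following lines. This is a minimal sketch, not Ms Petronis's actual procedure: the function name `find_trap_words` and the example trap words are hypothetical, and matching is assumed to be whole-word and case-insensitive.

```python
import re


def find_trap_words(essay: str, trap_words: list[str]) -> list[str]:
    """Return the trap words that occur in the essay as whole words.

    Matching is case-insensitive; a trap word embedded in a longer
    word (e.g. "banana" inside "bananas") is not counted.
    """
    found = []
    for word in trap_words:
        pattern = r"\b" + re.escape(word) + r"\b"
        if re.search(pattern, essay, flags=re.IGNORECASE):
            found.append(word)
    return found


# Hypothetical trap words planted in a hidden assignment prompt.
traps = ["Frankenstein", "banana"]
essay = "The narrator, much like frankenstein, regrets his ambition."
hits = find_trap_words(essay, traps)
print(hits)
```

A hit is only a prompt for a conversation with the student, since a trap word can also appear in honest work.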
ArguGPT: evaluating, understanding and identifying argumentative essays generated by GPT models
Liu, Yikang, Zhang, Ziyin, Zhang, Wanyang, Yue, Shisen, Zhao, Xiaojing, Cheng, Xinyuan, Zhang, Yiwen, Hu, Hai
AI-generated content (AIGC) presents a considerable challenge to educators around the world. Instructors need to be able to detect such text generated by large language models, either with the naked eye or with the help of some tools. There is also a growing need to understand the lexical, syntactic and stylistic features of AIGC. To address these challenges in English language teaching, we first present ArguGPT, a balanced corpus of 4,038 argumentative essays generated by 7 GPT models in response to essay prompts from three sources: (1) in-class or homework exercises, (2) TOEFL and (3) GRE writing tasks. Machine-generated texts are paired with a roughly equal number of human-written essays with three score levels matched in essay prompts. We then hire English instructors to distinguish machine essays from human ones. Results show that when first exposed to machine-generated essays, the instructors only have an accuracy of 61% in detecting them. But the number rises to 67% after one round of minimal self-training. Next, we perform linguistic analyses of these essays, which show that machines produce sentences with more complex syntactic structures while human essays tend to be lexically more complex. Finally, we test existing AIGC detectors and build our own detectors using SVMs and RoBERTa. Results suggest that a RoBERTa fine-tuned with the training set of ArguGPT achieves above 90% accuracy in both essay- and sentence-level classification. To the best of our knowledge, this is the first comprehensive analysis of argumentative essays produced by generative large language models. Machine-authored essays in ArguGPT and our models will be made publicly available at https://github.com/huhailinguist/ArguGPT
To Teach Better Writing, Don't Ban Artificial Intelligence. Instead, Embrace it. - Education Next
For all the speculation about ChatGPT's potential to upend K-12 writing instruction, there has been little investigation into the underlying assumption that the AI chatbot can produce writing that makes the grade. We put OpenAI's ChatGPT to the test by asking it to write essays in response to real school curriculum prompts. We then submitted those essays for evaluation. The results show that ChatGPT produces responses that meet or exceed standards across grade levels. This has big implications for schools, which should move with urgency to adjust their practices and learning models to keep pace with the shifting technological landscape.
ChatGPT can write English essays ... quite well. How are teachers going to deal? - Marketplace
Teachers are a creative bunch. They have to be to come up with lesson plans and exams that help students grow their minds and prevent those same students from relying too much on technology to enhance their work or to cheat. Which is why the rollout of OpenAI's ChatGPT has many teachers worried. The chatbot can answer almost any type of question, even if the answers aren't always accurate. Marketplace's Kimberly Adams spoke with Daniel Herman, an English teacher at Maybeck High School in Berkeley, California.