AITopics | Student Performance

Student Performance

Densely Connected Attention Propagation for Reading Comprehension

Yi Tay, Anh Tuan Luu, Siu Cheung Hui, Jian Su

Neural Information Processing Systems | Mar-26-2025, 09:22:11 GMT

Neural Information Processing Systems http://nips.cc/

arxiv preprint arxiv, machine learning, natural language, (14 more...)

Country:

North America (0.46)
Asia (0.28)

Industry: Education > Assessment & Standards > Student Performance (0.42)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Amazon's AI-generated summary of popular conservative book accuses it of 'extreme' rhetoric

FOX News | Mar-26-2025, 09:00:37 GMT

Markowicz previously explained why they wrote the book in a Fox News Digital opinion piece, noting that in 2021, then-Democratic Virginia gubernatorial candidate Terry McAuliffe said, "I don't think parents should be telling schools what they should teach." "Taken on its own, the comment might even be benign. Sure, parental involvement in education had always been a prediction of student success. A 2010 study called 'Parent Involvement and Student Academic Performance: A Multiple Mediational Analysis' by researchers at the Warren Alpert Medical School of Brown University and the University of North Carolina at Greensboro found 'children whose parents are more involved in their education have higher levels of academic performance than children whose parents are involved to a lesser degree.'" But should parents be designing a curriculum?

amazon, artificial intelligence, fox news digital, (12 more...)

FOX News

Country:

North America > United States > Virginia (0.25)
North America > United States > North Carolina (0.25)

Industry:

Education > Educational Setting > Higher Education (0.56)
Education > Assessment & Standards > Student Performance (0.36)

Technology: Information Technology > Artificial Intelligence (0.34)

Investigating Recent Large Language Models for Vietnamese Machine Reading Comprehension

Nguyen, Anh Duc, Phi, Hieu Minh, Ngo, Anh Viet, Trieu, Long Hai, Nguyen, Thai Phuong

arXiv.org Artificial Intelligence | Mar-23-2025

Large Language Models (LLMs) have shown remarkable proficiency in Machine Reading Comprehension (MRC) tasks; however, their effectiveness for low-resource languages like Vietnamese remains largely unexplored. In this paper, we fine-tune and evaluate two state-of-the-art LLMs: Llama 3 (8B parameters) and Gemma (7B parameters), on ViMMRC, a Vietnamese MRC dataset. By utilizing Quantized Low-Rank Adaptation (QLoRA), we efficiently fine-tune these models and compare their performance against powerful LLM-based baselines. Although our fine-tuned models are smaller than GPT-3 and GPT-3.5, they outperform both traditional BERT-based approaches and these larger models. This demonstrates the effectiveness of our fine-tuning process, showcasing how modern LLMs can surpass the capabilities of older models like BERT while still being suitable for deployment in resource-constrained environments. Through intensive analyses, we explore various aspects of model performance, providing valuable insights into adapting LLMs for low-resource languages like Vietnamese. Our study contributes to the advancement of natural language processing in low-resource languages, and we make our fine-tuned models publicly available at: https://huggingface.co/iaiuet.

large language model, llama 3, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2503.18062

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Education > Assessment & Standards > Student Performance (0.62)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Texas private school's use of new 'AI tutor' rockets student test scores to top 2% in the country

FOX News | Mar-22-2025, 14:01:24 GMT

Alpha School co-founder Mackenzie Price and a junior at the school, Elle Kristine, join 'Fox & Friends' to discuss the benefits of incorporating artificial intelligence into the classroom. A Texas private school is seeing student test scores soar to new heights following the implementation of an artificial intelligence (AI) "tutor." At Alpha School in Austin, Texas, students are placed in the classroom for two hours a day with an AI assistant, using the rest of the day to focus on skills like public speaking, financial literacy, and teamwork. "We use an AI tutor and adaptive apps to provide a completely personalized learning experience for all of our students, and as a result our students are learning faster, they're learning way better. In fact, our classes are in the top 2% in the country," Alpha School co-founder Mackenzie Price told "Fox & Friends." Will A.I. make schools 'obsolete,' or does it present a new 'opportunity' for the education system?

alpha school, artificial intelligence, texas private school, (8 more...)

FOX News

Country: North America > United States > Texas > Travis County > Austin (0.26)

Industry:

Education > Educational Setting (1.00)
Education > Assessment & Standards > Student Performance (0.62)
Education > Educational Technology > Educational Software > Computer Based Training (0.38)

Technology: Information Technology > Artificial Intelligence (1.00)

Enhancing Arabic Automated Essay Scoring with Synthetic Data and Error Injection

Qwaider, Chatrine, Alhafni, Bashar, Chirkunov, Kirill, Habash, Nizar, Briscoe, Ted

arXiv.org Artificial Intelligence | Mar-22-2025

Automated Essay Scoring (AES) plays a crucial role in assessing language learners' writing quality, reducing grading workload, and providing real-time feedback. Arabic AES systems are particularly challenged by the lack of annotated essay datasets. This paper presents a novel framework leveraging Large Language Models (LLMs) and Transformers to generate synthetic Arabic essay datasets for AES. We prompt an LLM to generate essays across CEFR proficiency levels and introduce controlled error injection using a fine-tuned Standard Arabic BERT model for error type prediction. Our approach produces realistic human-like essays, contributing a dataset of 3,040 annotated essays. Additionally, we develop a BERT-based auto-marking system for accurate and scalable Arabic essay evaluation. Experimental results demonstrate the effectiveness of our framework in improving Arabic AES performance.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.17739

Country:

Asia > Thailand (0.14)
Europe > Ukraine (0.14)
Europe > Germany (0.14)
Europe > France (0.14)

Genre:

Overview (0.93)
Research Report > New Finding (0.48)
Personal > Interview (0.46)

Industry:

Education > Assessment & Standards > Student Performance (1.00)
Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.71)
Education > Educational Technology > Educational Software > Computer Based Training (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Bridging the LLM Accessibility Divide? Performance, Fairness, and Cost of Closed versus Open LLMs for Automated Essay Scoring

Oketch, Kezia, Lalor, John P., Yang, Yi, Abbasi, Ahmed

arXiv.org Artificial Intelligence | Mar-14-2025

The rapid development of machine learning (ML) technologies, particularly large language models (LLMs), has led to major advancements in natural language processing (NLP, Abbasi et al. 2023). While much of this advancement happened under the umbrella of the common task framework which espouses transparency and openness (Abbasi et al. 2023), in recent years, closed LLMs such as GPT-3 and GPT-4 have set new performance standards in tasks ranging from text generation to question answering, demonstrating unprecedented capabilities in zero-shot and few-shot learning scenarios (Brown et al. 2020, OpenAI 2023). Given the strong performance of closed LLMs such as GPT-4, many studies within the LLM-as-a-judge paradigm rely on their scores as ground truth benchmarks for evaluating both open and closed LLMs (Chiang and Lee 2023), further entrenching the dominance of SOTA closed LLMs (Vergho et al. 2024). Along with closed LLMs, there are also LLMs where the pre-trained models (i.e., training weights) and inference code are publicly available ("open LLMs") such as Llama (Touvron et al. 2023, Dubey et al. 2024) as well as LLMs where the full training data and training code are also available ("open-source LLMs") such as OLMo (Groeneveld et al. 2024). Open and open-source LLMs provide varying levels of transparency for developers and researchers (Liu et al. 2023). Access to model weights, training data, and inference code enables several benefits for the user-developer-researcher community, including lower costs per input/output token through third-party API services, support for local/offline pre-training and fine-tuning, and deeper analysis of model biases and debiasing strategies. However, the dominance of closed LLMs raises a number of concerns, including accessibility and fairness (Strubell et al. 2020, Bender 2021, Irugalbandara et al. 2024).

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.11827

Country: Asia > Thailand (0.14)

Genre:

Research Report > New Finding (1.00)
Overview (0.93)
Research Report > Experimental Study (0.68)

Industry:

Education > Assessment & Standards > Student Performance (0.89)
Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Transfer Learning for Automated Feedback Generation on Small Datasets

Morris, Oscar

arXiv.org Artificial Intelligence | Mar-14-2025

Feedback is a very important part of the learning process. However, it is challenging to make this feedback both timely and accurate when relying on human markers. This is the challenge that Automated Feedback Generation attempts to address. In this paper, a technique to train such a system on a very small dataset with very long sequences is presented. Both of these attributes make this a very challenging task; however, by using a three-stage transfer learning pipeline, state-of-the-art results can be achieved, with output that is qualitatively accurate but does not sound human. The use of both Automated Essay Scoring and Automated Feedback Generation systems in the real world is also discussed.

artificial intelligence, automated feedback generation, machine learning, (2 more...)

arXiv.org Artificial Intelligence

2503.11836

Genre: Research Report (0.69)

Industry:

Education > Educational Technology > Educational Software (0.53)
Education > Assessment & Standards > Student Performance (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.60)

MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark

Ma, Shengkun, Peng, Hao, Hou, Lei, Li, Juanzi

arXiv.org Artificial Intelligence | Mar-10-2025

Machine Reading Comprehension (MRC) is an essential task in evaluating natural language understanding. Existing MRC datasets primarily assess specific aspects of reading comprehension (RC), lacking a comprehensive MRC benchmark. To fill this gap, we first introduce a novel taxonomy that categorizes the key capabilities required for RC. Based on this taxonomy, we construct MRCEval, an MRC benchmark that leverages advanced Large Language Models (LLMs) as both sample generators and selection judges. MRCEval is a comprehensive, challenging and accessible benchmark designed to assess the RC capabilities of LLMs thoroughly, covering 13 distinct RC skills with a total of 2.1K high-quality multi-choice questions. We perform an extensive evaluation of 28 widely used open-source and proprietary models, highlighting that MRC continues to present significant challenges even in the era of LLMs.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.07144

Country:

Europe (1.00)
Asia (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.64)

Industry: Education > Assessment & Standards > Student Performance (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

QG-SMS: Enhancing Test Item Analysis via Student Modeling and Simulation

Nguyen, Bang, Du, Tingting, Yu, Mengxia, Angrave, Lawrence, Jiang, Meng

arXiv.org Artificial Intelligence | Mar-7-2025

While the Question Generation (QG) task has been increasingly adopted in educational assessments, its evaluation remains limited by approaches that lack a clear connection to the educational values of test items. In this work, we introduce test item analysis, a method frequently used by educators to assess test question quality, into QG evaluation. Specifically, we construct pairs of candidate questions that differ in quality across dimensions such as topic coverage, item difficulty, item discrimination, and distractor efficiency. We then examine whether existing QG evaluation approaches can effectively distinguish these differences. Our findings reveal significant shortcomings in these approaches with respect to accurately assessing test item quality in relation to student performance. To address this gap, we propose a novel QG evaluation framework, QG-SMS, which leverages Large Language Models for Student Modeling and Simulation to perform test item analysis. As demonstrated in our extensive experiments and human evaluation study, the additional perspectives introduced by the simulated student profiles lead to a more effective and robust assessment of test items.
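
For readers unfamiliar with test item analysis, the sketch below illustrates three of the dimensions named in the abstract (item difficulty, item discrimination, distractor efficiency) using common classical test theory conventions. It is not the QG-SMS method: the function names, the 27% upper/lower group split, the 5% "functional distractor" threshold, and the sample data are all assumptions chosen for illustration.

```python
from collections import Counter


def item_difficulty(responses, key):
    """Proportion of examinees who chose the keyed (correct) option."""
    return sum(r == key for r in responses) / len(responses)


def item_discrimination(responses, key, total_scores, fraction=0.27):
    """Upper-lower index: difficulty among top scorers minus difficulty
    among bottom scorers. The 27% split is an assumed convention."""
    n = len(responses)
    k = max(1, round(n * fraction))
    # Rank examinees by their overall test score (ascending).
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower = [responses[i] for i in order[:k]]
    upper = [responses[i] for i in order[-k:]]
    return item_difficulty(upper, key) - item_difficulty(lower, key)


def distractor_efficiency(responses, key, options, min_share=0.05):
    """Share of wrong options picked by at least `min_share` of examinees.
    The 5% threshold is an assumed convention."""
    counts = Counter(responses)
    distractors = [o for o in options if o != key]
    functional = [d for d in distractors
                  if counts[d] / len(responses) >= min_share]
    return len(functional) / len(distractors)


# Hypothetical data: ten examinees answering one four-option item.
responses = ["A", "A", "B", "A", "C", "A", "B", "A", "A", "C"]
total_scores = [9, 8, 3, 7, 2, 9, 4, 6, 8, 1]

print(item_difficulty(responses, "A"))
print(item_discrimination(responses, "A", total_scores))
print(distractor_efficiency(responses, "A", ["A", "B", "C", "D"]))
```

In this toy data, option D is never chosen, so it counts as a non-functional distractor; in a framework like the one described, the responses would come from simulated student profiles instead of a hand-written list.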

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.05888

Country:

Europe (0.93)
North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > UAE (0.14)
(3 more...)

Genre:

Research Report > New Finding (0.34)
Instructional Material > Course Syllabus & Notes (0.31)

Industry:

Education > Educational Technology > Educational Software (0.61)
Education > Assessment & Standards > Student Performance (0.51)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)

MathMistake Checker: A Comprehensive Demonstration for Step-by-Step Math Problem Mistake Finding by Prompt-Guided LLMs

Zhang, Tianyang, Jiang, Zhuoxuan, Zhang, Haotian, Lin, Lin, Zhang, Shaohua

arXiv.org Artificial Intelligence | Mar-6-2025

We propose a novel system, MathMistake Checker, designed to automate step-by-step mistake finding in mathematical problems with lengthy answers through a two-stage process. The system aims to simplify grading, increase efficiency, and enhance learning experiences from a pedagogical perspective. It integrates advanced technologies, including computer vision and the chain-of-thought capabilities of the latest large language models (LLMs). Our system supports open-ended grading without reference answers and promotes personalized learning by providing targeted feedback. We demonstrate its effectiveness across various types of math problems, such as calculation and word problems.

artificial intelligence, large language model, natural language, (13 more...)

arXiv.org Artificial Intelligence

2503.04291

Country: Asia > China (0.18)

Genre: Research Report (0.40)

Industry: Education > Assessment & Standards > Student Performance (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
