automatic grading
Ensemble ToT of LLMs and Its Application to Automatic Grading System for Supporting Self-Learning
Providing students with detailed and timely grading feedback is essential for self-learning. While existing LLM-based grading systems are promising, most of them rely on one single model, which limits their performance. To address this, we propose Ensemble Tree-of-Thought (ToT), a framework that enhances LLM outputs by integrating multiple models. Using this framework, we develop a grading system. Ensemble ToT follows three steps: (1) analyzing LLM performance, (2) generating candidate answers, and (3) refining them into a final result. Based on this, our grading system first evaluates the grading tendencies of LLMs, then generates multiple results, and finally integrates them via a simulated debate. Experimental results demonstrate our approach's ability to provide accurate and explainable grading by effectively coordinating multiple LLMs.
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
- Asia > Middle East > Jordan (0.04)
- North America > United States (0.04)
- (3 more...)
Automatic Grading of Knee Osteoarthritis on the Kellgren-Lawrence Scale from Radiographs Using Convolutional Neural Networks
Kondal, Sudeep, Kulkarni, Viraj, Gaikwad, Ashrika, Kharat, Amit, Pant, Aniruddha
The severity of knee osteoarthritis is graded using the 5-point Kellgren-Lawrence (KL) scale where healthy knees are assigned grade 0, and the subsequent grades 1-4 represent increasing severity of the affliction. Although several methods have been proposed in recent years to develop models that can automatically predict the KL grade from a given radiograph, most models have been developed and evaluated on datasets not sourced from India. These models fail to perform well on the radiographs of Indian patients. In this paper, we propose a novel method using convolutional neural networks to automatically grade knee radiographs on the KL scale. Our method works in two connected stages: in the first stage, an object detection model segments individual knees from the rest of the image; in the second stage, a regression model automatically grades each knee separately on the KL scale. We train our model using the publicly available Osteoarthritis Initiative (OAI) dataset and demonstrate that fine-tuning the model before evaluating it on a dataset from a private hospital significantly improves the mean absolute error from 1.09 (95% CI: 1.03-1.15) to 0.28 (95% CI: 0.25-0.32). Additionally, we compare classification and regression models built for the same task and demonstrate that regression outperforms classification.
- Asia > India (0.25)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Retail > Online (0.40)
- Information Technology > Services (0.40)