AITopics

Neural Information Processing Systems, Jan-24-2025, 00:13:57 GMT

Review for NeurIPS paper: Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes

Clarity: The paper is overall clear and well written. I have a few suggestions to make it even easier to understand and/or fix some minor inconsistencies. There is no need for the authors to respond to these points, as I think the paper is already rather clear. I am unsure what Figure 1 represents. I might have missed it, but I think pi is not defined.

gaussian process, infinite mixture, task-agnostic online reinforcement learning, (3 more...)

Genre: Instructional Material > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Neural Information Processing Systems, Jan-24-2025, 00:13:50 GMT

Review for NeurIPS paper: Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes

Reviewers agreed the paper contains interesting and sound contributions to an important problem, and is generally well written, although the model is fairly complex and the experimental domains are a bit simple. The authors are encouraged to provide further details to justify/explain certain algorithmic choices, include some of the key derivation steps (maybe with details in the appendix), and augment the experiments (like those in the rebuttal).

gaussian process, infinite mixture, task-agnostic online reinforcement learning, (1 more...)

Genre: Instructional Material > Online (0.40)

Technology:

Information Technology > Modeling & Simulation (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Spriggs, Kyle, Lau, Meng Cheng, Passi, Kalpdrum

Personalizing Education through an Adaptive LMS with Integrated LLMs

The widespread adoption of large language models (LLMs) marks a transformative era in technology, especially within the educational sector. This paper explores the integration of LLMs within learning management systems (LMSs) to develop an adaptive learning management system (ALMS) personalized for individual learners across various educational stages. Traditional LMSs, while facilitating the distribution of educational materials, fall short in addressing the nuanced needs of diverse student populations, particularly in settings with limited instructor availability. Our proposed system leverages the flexibility of AI to provide a customizable learning environment that adjusts to each user's evolving needs. By integrating a suite of general-purpose and domain-specific LLMs, this system aims to minimize common issues such as factual inaccuracies and outdated information, characteristic of general LLMs like OpenAI's ChatGPT. This paper details the development of an ALMS that not only addresses privacy concerns and the limitations of existing educational tools but also enhances the learning experience by maintaining engagement through personalized educational content.

large language model, machine learning, natural language, (19 more...)

2502.08655

Country:

North America > United States (0.04)
North America > Canada > Ontario > Thunder Bay District > Sudbury (0.04)

Genre:

Instructional Material > Course Syllabus & Notes (0.46)
Research Report > New Finding (0.46)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Assessment & Standards > Student Performance (1.00)
Education > Educational Setting > Higher Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Wu, Fuping, Papiez, Bartlomiej W.

Rethinking Foundation Models for Medical Image Classification through a Benchmark Study on MedMNIST

Foundation models are widely employed in medical image analysis, due to their high adaptability and generalizability for downstream tasks. With the increasing number of foundation models being released, model selection has become an important issue. In this work, we study the capabilities of foundation models in medical image classification tasks by conducting a benchmark study on the MedMNIST dataset. Specifically, we adopt various foundation models ranging from convolutional to Transformer-based models and implement both end-to-end training and linear probing for all classification tasks. The results demonstrate the significant potential of these pre-trained models when transferred for medical image classification. We further conduct experiments with different image sizes and various sizes of training data. By analyzing all the results, we provide preliminary, yet useful insights and conclusions on this topic.

large language model, machine learning, natural language, (18 more...)

2501.14685

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre:

Research Report > New Finding (0.88)
Instructional Material > Online (0.84)
Instructional Material > Course Syllabus & Notes (0.84)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

A Zero-Shot LLM Framework for Automatic Assignment Grading in Higher Education

Yeung, Calvin, Yu, Jeff, Cheung, King Chau, Wong, Tat Wing, Chan, Chun Man, Wong, Kin Chi, Fujii, Keisuke

Automated grading has become an essential tool in education technology due to its ability to efficiently assess large volumes of student work, provide consistent and unbiased evaluations, and deliver immediate feedback to enhance learning. However, current systems face significant limitations, including the need for large datasets in few-shot learning methods, a lack of personalized and actionable feedback, and an overemphasis on benchmark performance rather than student experience. To address these challenges, we propose a Zero-Shot Large Language Model (LLM)-Based Automated Assignment Grading (AAG) system. This framework leverages prompt engineering to evaluate both computational and explanatory student responses without requiring additional training or fine-tuning. The AAG system delivers tailored feedback that highlights individual strengths and areas for improvement, thereby enhancing student learning outcomes. Our study demonstrates the system's effectiveness through comprehensive evaluations, including survey responses from higher education students that indicate significant improvements in motivation, understanding, and preparedness compared to traditional grading methods. The results validate the AAG system's potential to transform educational assessment by prioritizing learning experiences and providing scalable, high-quality feedback.

large language model, machine learning, natural language, (17 more...)

2501.14305

Country:

North America > United States (0.14)
Asia > China > Hong Kong (0.05)
Asia > Japan > Honshū > Chūbu > Aichi Prefecture > Nagoya (0.04)
Africa > Middle East > Morocco (0.04)

Genre:

Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Overview (0.93)

Industry:

Education > Educational Setting > Higher Education (1.00)
Education > Assessment & Standards (1.00)
Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Poličar, Pavlin G., Špendl, Martin, Curk, Tomaž, Zupan, Blaž

Automated Assignment Grading with Large Language Models: Insights From a Bioinformatics Course

Providing students with individualized feedback through assignments is a cornerstone of education that supports their learning and development. Studies have shown that timely, high-quality feedback plays a critical role in improving learning outcomes. However, providing personalized feedback on a large scale in classes with large numbers of students is often impractical due to the significant time and effort required. Recent advances in natural language processing and large language models (LLMs) offer a promising solution by enabling the efficient delivery of personalized feedback. These technologies can reduce the workload of course staff while improving student satisfaction and learning outcomes. Their successful implementation, however, requires thorough evaluation and validation in real classrooms. We present the results of a practical evaluation of LLM-based graders for written assignments in the 2024/25 iteration of the Introduction to Bioinformatics course at the University of Ljubljana. Over the course of the semester, more than 100 students answered 36 text-based questions, most of which were automatically graded using LLMs. In a blind study, students received feedback from both LLMs and human teaching assistants without knowing the source, and later rated the quality of the feedback. We conducted a systematic evaluation of six commercial and open-source LLMs and compared their grading performance with human teaching assistants. Our results show that with well-designed prompts, LLMs can achieve grading accuracy and feedback quality comparable to human graders. Our results also suggest that open-source LLMs perform as well as commercial LLMs, allowing schools to implement their own grading systems while maintaining privacy.

large language model, machine learning, natural language, (20 more...)

2501.14499

Country:

Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.25)
North America > United States > Florida > Miami-Dade County > Miami (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Setting (1.00)
Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.51)
Education > Educational Technology > Educational Software > Computer Based Training (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Emre, Taha, Araújo, Teresa, Oghbaie, Marzieh, Lachinov, Dmitrii, Aresta, Guilherme, Bogunović, Hrvoje

Automatic detection and prediction of nAMD activity change in retinal OCT using Siamese networks and Wasserstein Distance for ordinality

Neovascular age-related macular degeneration (nAMD) is a leading cause of vision loss among older adults, where disease activity detection and progression prediction are critical for nAMD management in terms of timely drug administration and improving patient outcomes. Recent advancements in deep learning offer a promising solution for predicting changes in AMD from optical coherence tomography (OCT) retinal volumes. In this work, we propose deep learning models for the two tasks of the public MARIO Challenge at MICCAI 2024, designed to detect and forecast changes in nAMD severity with longitudinal retinal OCT. For the first task, we employ a Vision Transformer (ViT) based Siamese Network to detect changes in AMD severity by comparing scan embeddings of a patient from different time points. To train a model to forecast the change after 3 months, we exploit, for the first time, an Earth Mover (Wasserstein) Distance-based loss to harness the ordinal relation within the severity change classes. Both models ranked high on the preliminary leaderboard, demonstrating that their predictive capabilities could facilitate nAMD treatment management.
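As a rough illustration of why an Earth Mover (Wasserstein) distance suits ordered classes, the sketch below computes the one-dimensional Wasserstein-1 distance between two class distributions as the sum of absolute differences of their cumulative distributions. This is a generic formulation, not necessarily the exact loss used by the authors; the function name `emd_ordinal`, the three-class setup, and the unit spacing between classes are assumptions made for the example.

```python
from itertools import accumulate


def emd_ordinal(pred_probs, target_probs):
    """Wasserstein-1 distance between two distributions over ordered classes.

    Assumes classes are equally spaced (unit distance between neighbours),
    in which case the distance is the sum of absolute CDF differences.
    """
    if len(pred_probs) != len(target_probs):
        raise ValueError("distributions must have the same number of classes")
    pred_cdf = accumulate(pred_probs)
    target_cdf = accumulate(target_probs)
    return sum(abs(p - t) for p, t in zip(pred_cdf, target_cdf))


# Hypothetical three-class example: the true class is the first one.
target = [1.0, 0.0, 0.0]
near_miss = [0.0, 1.0, 0.0]  # prediction off by one class
far_miss = [0.0, 0.0, 1.0]   # prediction off by two classes

print(emd_ordinal(near_miss, target))  # 1.0
print(emd_ordinal(far_miss, target))   # 2.0
```

Unlike cross-entropy, which would penalize both wrong predictions equally, this distance grows with how far the predicted class lies from the true one.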

artificial intelligence, machine learning, prediction, (16 more...)

2501.14323

Country:

Europe > Austria > Vienna (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre:

Research Report > Promising Solution (0.34)
Instructional Material > Course Syllabus & Notes (0.30)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images

Baral, Sami, Lucy, Li, Knight, Ryan, Ng, Alice, Soldaini, Luca, Heffernan, Neil T., Lo, Kyle

In real-world settings, vision language models (VLMs) should robustly handle naturalistic, noisy visual content as well as domain-specific language and concepts. For example, K-12 educators using digital learning platforms may need to examine and provide feedback across many images of students' math work. To assess the potential of VLMs to support educators in settings like this one, we introduce DrawEduMath, an English-language dataset of 2,030 images of students' handwritten responses to K-12 math problems. Teachers provided detailed annotations, including free-form descriptions of each image and 11,661 question-answer (QA) pairs. These annotations capture a wealth of pedagogical insights, ranging from students' problem-solving strategies to the composition of their drawings, diagrams, and writing. We evaluate VLMs on teachers' QA pairs, as well as 44,362 synthetic QA pairs derived from teachers' descriptions using language models (LMs). We show that even state-of-the-art VLMs leave much room for improvement on DrawEduMath questions. We also find that synthetic QAs, though imperfect, can yield model rankings similar to those from teacher-written QAs. We release DrawEduMath to support the evaluation of VLMs' abilities to reason mathematically over images gathered with educational contexts in mind.
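One common way to check whether two question sets rank models similarly is a rank correlation such as Kendall's tau. The sketch below is only an illustration of that idea; the abstract does not state which measure the authors used, and the model names and scores are hypothetical placeholders, not results from the paper.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall rank correlation between two score dictionaries.

    Both dictionaries map the same model names to scores. Pairs tied in
    either dictionary are skipped, so with no ties this is Kendall's tau.
    """
    concordant = discordant = 0
    for m1, m2 in combinations(list(scores_a), 2):
        sign = (scores_a[m1] - scores_a[m2]) * (scores_b[m1] - scores_b[m2])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total = concordant + discordant
    return (concordant - discordant) / total if total else 0.0


# Hypothetical accuracies on teacher-written and synthetic QA pairs.
teacher = {"model_a": 0.60, "model_b": 0.50, "model_c": 0.40}
synthetic_same_order = {"model_a": 0.70, "model_b": 0.65, "model_c": 0.50}
synthetic_one_swap = {"model_a": 0.60, "model_b": 0.70, "model_c": 0.50}

print(kendall_tau(teacher, synthetic_same_order))  # 1.0
print(kendall_tau(teacher, synthetic_one_swap))
```

A value near 1 means the two question sets order the models the same way, even if the absolute scores differ.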

large language model, machine learning, question answering, (23 more...)

2501.14877

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Industry:

Education > Educational Setting > Online (0.48)
Education > Curriculum > Subject-Specific Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.98)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

arXiv.org Artificial Intelligence, Jan-23-2025

Auto-Evaluation: A Critical Measure in Driving Improvements in Quality and Safety of AI-Generated Lesson Resources

Clark, Hannah-Beth, Dowland, Margaux, Benton, Laura, Budai, Reka, Keskin, Ibrahim Kaan, Searle, Emma, Gregory, Matthew, Hodierne, Mark, Gayne, William, Roberts, John

As a publicly funded body in the UK, Oak National Academy is in a unique position to innovate within this field as we have a comprehensive curriculum of approximately 13,000 open education resources (OER) for all National Curriculum subjects, designed and quality-assured by expert, human teachers. This has provided the corpus of content needed for building a high-quality AI-powered lesson planning tool, Aila, that is free to use and, therefore, accessible to all teachers across the country. Furthermore, using our evidence-informed curriculum principles, we have codified and exemplified each component of lesson design. To assess the quality of lessons produced by Aila at scale, we have developed an AI-powered auto-evaluation agent, facilitating informed improvements to enhance output quality. Through comparisons between human and auto-evaluations, we have begun to refine this agent further to increase its accuracy, measured by its alignment with an expert human evaluator. In this paper we present this iterative evaluation process through an illustrative case study focused on one quality benchmark: the level of challenge within multiple-choice quizzes. We also explore the contribution that this may make to similar projects and the wider sector.

artificial intelligence, distractor, natural language, (9 more...)

2502.1041

Country:

North America > United States > Virginia (0.04)
Europe > United Kingdom > England (0.04)
Europe > France (0.04)

Genre:

Instructional Material (1.00)
Questionnaire & Opinion Survey (0.93)
Research Report (0.82)

Industry:

Information Technology > Security & Privacy (0.69)
Education > Curriculum (0.49)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.69)