Instructional Material
From Objectives to Questions: A Planning-based Framework for Educational Mathematical Question Generation
Cheng, Cheng, Huang, Zhenya, Zhao, Guanhao, Guo, Yuxiang, Lin, Xin, Wu, Jinze, Li, Xin, Wang, Shijin
Automatically generating high-quality mathematical problems that align with educational objectives is a crucial task in NLP-based educational technology. Traditional generation methods focus primarily on textual quality, but they often overlook educational objectives. Moreover, these methods address only single-dimensional, simple question generation, failing to meet complex, multifaceted educational requirements. To address these challenges, we constructed and annotated EduMath, a dataset of 16k mathematical questions with multi-dimensional educational objectives. Based on this dataset, we developed EQGEVAL, which incorporates three evaluation dimensions and is designed to assess the ability of models to generate educational questions. Drawing inspiration from teachers' problem design processes, we propose the Educational Question Planning with self-Reflection (EQPR) method for educational mathematical question generation, following a "plan-evaluate-optimize" approach. Specifically, by combining planning algorithm based on Monte Carlo Tree Search with the generative capabilities of Large Language Models, we continuously optimize questions through iterative feedback. This self-optimization mechanism ensures that the generated questions both fit the educational context and strategically achieve specific basic educational objectives. Through extensive experiments based on EQGEVAL, we have demonstrated that EQPR achieves significant improvements in generating questions that meet multi-dimensional educational objectives.
Adaptive-VP: A Framework for LLM-Based Virtual Patients that Adapts to Trainees' Dialogue to Facilitate Nurse Communication Training
Lee, Keyeun, Lee, Seolhee, Kim, Esther Hehsun, Ko, Yena, Eun, Jinsu, Kim, Dahee, Cho, Hyewon, Zhu, Haiyi, Kraut, Robert E., Suh, Eunyoung, Kim, Eun-mee, Lim, Hajin
Effective communication training is essential to preparing nurses for high-quality patient care. While standardized patient (SP) simulations provide valuable experiential learning, they are often costly and inflexible. Virtual patient (VP) systems offer a scalable alternative, but most fail to adapt to the varying communication skills of trainees. In particular, when trainees respond ineffectively, VPs should escalate in hostility or become uncooperative--yet this level of adaptive interaction remains largely unsupported. To address this gap, we introduce Adaptive-VP, a VP dialogue generation framework that leverages large language models (LLMs) to dynamically adapt VP behavior based on trainee input. The framework features a pipeline for constructing clinically grounded yet flexible VP scenarios and a modular system for assessing trainee communication and adjusting VP responses in real time, while ensuring learner safety. We validated Adaptive-VP by simulating challenging patient conversations. Automated evaluation using a corpus from practicing nurses showed that our communication skill evaluation mechanism reflected real-world proficiency levels. Expert nurses further confirmed that Adaptive-VP produced more natural and realistic interactions than existing approaches, demonstrating its potential as a scalable and effective tool for nursing communication training.
Direct Feedback Alignment Provides Learning in Deep Neural Networks
Artificial neural networks are most commonly trained with the back-propagation algorithm, where the gradient for learning is provided by back-propagating the error, layer by layer, from the output layer to the hidden layers. A recently discovered method called feedback-alignment shows that the weights used for propagating the error backward don't have to be symmetric with the weights used for propagation the activation forward. In fact, random feedback weights work evenly well, because the network learns how to make the feedback useful. In this work, the feedback alignment principle is used for training hidden layers more independently from the rest of the network, and from a zero initial condition. The error is propagated through fixed random feedback connections directly from the output layer to each hidden layer. This simple method is able to achieve zero training error even in convolutional networks and very deep networks, completely without error backpropagation. The method is a step towards biologically plausible machine learning because the error signal is almost local, and no symmetric or reciprocal weights are required. Experiments show that the test performance on MNIST and CIFAR is almost as good as those obtained with back-propagation for fully connected networks. If combined with dropout, the method achieves 1.45% error on the permutation invariant MNIST task.
Statistical mechanics of extensive-width Bayesian neural networks near interpolation
Barbier, Jean, Camilli, Francesco, Nguyen, Minh-Toan, Pastore, Mauro, Skerk, Rudy
For three decades statistical mechanics has been providing a framework to analyse neural networks. However, the theoretically tractable models, e.g., perceptrons, random features models and kernel machines, or multi-index models and committee machines with few neurons, remained simple compared to those used in applications. In this paper we help reducing the gap between practical networks and their theoretical understanding through a statistical physics analysis of the supervised learning of a two-layer fully connected network with generic weight distribution and activation function, whose hidden layer is large but remains proportional to the inputs dimension. This makes it more realistic than infinitely wide networks where no feature learning occurs, but also more expressive than narrow ones or with fixed inner weights. We focus on the Bayes-optimal learning in the teacher-student scenario, i.e., with a dataset generated by another network with the same architecture. We operate around interpolation, where the number of trainable parameters and of data are comparable and feature learning emerges. Our analysis uncovers a rich phenomenology with various learning transitions as the number of data increases. In particular, the more strongly the features (i.e., hidden neurons of the target) contribute to the observed responses, the less data is needed to learn them. Moreover, when the data is scarce, the model only learns non-linear combinations of the teacher weights, rather than "specialising" by aligning its weights with the teacher's. Specialisation occurs only when enough data becomes available, but it can be hard to find for practical training algorithms, possibly due to statistical-to-computational~gaps.
HESEIA: A community-based dataset for evaluating social biases in large language models, co-designed in real school settings in Latin America
Ivetta, Guido, Gomez, Marcos J., Martinelli, Sofรญa, Palombini, Pietro, Echeveste, M. Emilia, Mazzeo, Nair Carolina, Busaniche, Beatriz, Benotti, Luciana
Most resources for evaluating social biases in Large Language Models are developed without co-design from the communities affected by these biases, and rarely involve participatory approaches. We introduce HESEIA, a dataset of 46,499 sentences created in a professional development course. The course involved 370 high-school teachers and 5,370 students from 189 Latin-American schools. Unlike existing benchmarks, HESEIA captures intersectional biases across multiple demographic axes and school subjects. It reflects local contexts through the lived experience and pedagogical expertise of educators. Teachers used minimal pairs to create sentences that express stereotypes relevant to their school subjects and communities. We show the dataset diversity in term of demographic axes represented and also in terms of the knowledge areas included. We demonstrate that the dataset contains more stereotypes unrecognized by current LLMs than previous datasets. HESEIA is available to support bias assessments grounded in educational communities.
Generative Knowledge Production Pipeline Driven by Academic Influencers
Feher, Katalin, Demeter, Marton
ABSTRACT Generative AI transforms knowledge production, validation, and dissemination, raising academic integrity and credibility concerns. This study examines 53 academic influencer videos that reached 5.3 million viewers to identify an emerging, structured, implementation-ready pipeline balancing originality, ethical compliance, and human-AI collaboration despite the disruptive impacts. Findings highlight generative AI's potential to automate publication workflows and democratize participation in knowledge production while challenging traditional scientific norms. Academic influencers emerge as key intermediaries in this paradigm shift, connecting bottom-up practices with institutional policies to improve adaptability. Accordingly, the study proposes a generative publication production pipeline and a policy framework for co-intelligence adaptation and reinforcing credibility-centered standards in AI-powered research. These insights support scholars, educators, and policymakers in understanding AI's transformative impact by advocating responsible and innovation-driven knowledge production. Additionally, they reveal pathways for automating best practices, optimizing scholarly workflows, and fostering creativity in academic research and publication. Keywords: generative AI, ChatPGT, academic integrity, influencers, knowledge production, social media, policy implications, academic policy 1. INTRODUCTION The advent of generative AI (GenAI) transforms knowledge production, increasingly supporting and partially automating the academic workflow (Bolanos et al. 2024). This trend suggests a paradigm shift where researchers utilize effectively and productively generative AI tools, potentially leading to more automated scientific workflows. However, we have also identified a human component in this process: the impact of the academic influencers via social media promoting hands-on knowledge about GenAI in academic projects.
RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation
Xie, Zhentao, Han, Chengcheng, Shi, Jinxin, Cui, Wenjun, Zhao, Xin, Wu, Xingjiao, Zhao, Jiabao
Although multi-agent systems based on large language models show strong capabilities on multiple tasks, they are still limited by high computational overhead, information loss, and robustness. Inspired by ResNet's residual learning, we propose Residual Mixture-of-Agents (RMoA), integrating residual connections to optimize efficiency and reliability. To maximize information utilization from model responses while minimizing computational costs, we innovatively design an embedding-based diversity selection mechanism that greedily selects responses via vector similarity. Furthermore, to mitigate iterative information degradation, we introduce a Residual Extraction Agent to preserve cross-layer incremental information by capturing inter-layer response differences, coupled with a Residual Aggregation Agent for hierarchical information integration. Additionally, we propose an adaptive termination mechanism that dynamically halts processing based on residual convergence, further improving inference efficiency. RMoA achieves state-of-the-art performance on the benchmarks of across alignment, mathematical reasoning, code generation, and multitasking understanding, while significantly reducing computational overhead. Code is available at https://github.com/mindhunter01/RMoA.
ZeroGUI: Automating Online GUI Learning at Zero Human Cost
Yang, Chenyu, Su, Shiqian, Liu, Shi, Dong, Xuan, Yu, Yue, Su, Weijie, Wang, Xuehui, Liu, Zhaoyang, Zhu, Jinguo, Li, Hao, Wang, Wenhai, Qiao, Yu, Zhu, Xizhou, Dai, Jifeng
The rapid advancement of large Vision-Language Models (VLMs) has propelled the development of pure-vision-based GUI Agents, capable of perceiving and operating Graphical User Interfaces (GUI) to autonomously fulfill user instructions. However, existing approaches usually adopt an offline learning framework, which faces two core limitations: (1) heavy reliance on high-quality manual annotations for element grounding and action supervision, and (2) limited adaptability to dynamic and interactive environments. To address these limitations, we propose ZeroGUI, a scalable, online learning framework for automating GUI Agent training at Zero human cost. Specifically, ZeroGUI integrates (i) VLM-based automatic task generation to produce diverse training goals from the current environment state, (ii) VLM-based automatic reward estimation to assess task success without hand-crafted evaluation functions, and (iii) two-stage online reinforcement learning to continuously interact with and learn from GUI environments. Experiments on two advanced GUI Agents (UI-TARS and Aguvis) demonstrate that ZeroGUI significantly boosts performance across OSWorld and AndroidLab environments. The code is available at https://github.com/OpenGVLab/ZeroGUI.
Self-Adapting Drones for Unpredictable Worlds
Register now free-of-charge to explore this white paper How Embodied Intelligence Enhances the Safety, Resilience, and Autonomy of UAV Systems As drones evolve into critical agents across defense, disaster response, and infrastructure inspection, they must become more adaptive, secure, and resilient. Traditional AI methods fall short in real-world unpredictability. This whitepaper from the Technology Innovation Institute (TII) explores how Embodied AI - AI that integrates perception, action, memory, and learning in dynamic environments, can revolutionize drone operations. Drawing from innovations in GenAI, Physical AI, and zero-trust frameworks, TII outlines a future where drones can perceive threats, adapt to change, and collaborate safely in real time. The result: smarter, safer, and more secure autonomous aerial systems. What Attendees will Learn: Why Embodied AI Outperforms Traditional AI The 4 Pillars of Drone Intelligence Swarm Resilience in Dynamic Environments Security Breakthroughs for Critical Missions Click on the cover to download the white paper PDF now.
From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization
Ji, Haonian, Qiu, Shi, Xin, Siyang, Han, Siwei, Chen, Zhaorun, Zhang, Dake, Wang, Hongyi, Yao, Huaxiu
While foundation models (FMs), such as diffusion models and large vision-language models (LVLMs), have been widely applied in educational contexts, their ability to generate pedagogically effective visual explanations remains limited. Most existing approaches focus primarily on textual reasoning, overlooking the critical role of structured and interpretable visualizations in supporting conceptual understanding. To better assess the visual reasoning capabilities of FMs in educational settings, we introduce EduVisBench, a multi-domain, multi-level benchmark. EduVisBench features diverse STEM problem sets requiring visually grounded solutions, along with a fine-grained evaluation rubric informed by pedagogical theory. Our empirical analysis reveals that existing models frequently struggle with the inherent challenge of decomposing complex reasoning and translating it into visual representations aligned with human cognitive processes. To address these limitations, we propose EduVisAgent, a multi-agent collaborative framework that coordinates specialized agents for instructional planning, reasoning decomposition, metacognitive prompting, and visualization design. Experimental results show that EduVisAgent substantially outperforms all baselines, achieving a 40.2% improvement and delivering more educationally aligned visualizations. EduVisBench and EduVisAgent are available at https://github.com/aiming-lab/EduVisBench and https://github.com/aiming-lab/EduVisAgent.