Instructional Material
NVIDIA Nemotron Nano V2 VL
NVIDIA: Deshmukh, Amala Sanjay, Chumachenko, Kateryna, Rintamaki, Tuomas, Le, Matthieu, Poon, Tyler, Taheri, Danial Mohseni, Karmanov, Ilia, Liu, Guilin, Seppanen, Jarno, Chen, Guo, Sapra, Karan, Yu, Zhiding, Renduchintala, Adi, Wang, Charles, Jin, Peter, Goel, Arushi, Ranzinger, Mike, Voegtle, Lukas, Fischer, Philipp, Roman, Timo, Ping, Wei, Wang, Boxin, Yang, Zhuolin, Lee, Nayeon, Zhang, Shaokun, Liu, Fuxiao, Li, Zhiqi, Zhang, Di, Heinrich, Greg, Yin, Hongxu, Han, Song, Molchanov, Pavlo, Mannan, Parth, Xu, Yao, Scowcroft, Jane Polak, Balough, Tom, Radhakrishnan, Subhashree, Zhang, Paris, Cha, Sean, Kumar, Ratnesh, Bhat, Zaid Pervaiz, Zhang, Jian, Hanley, Darragh, Biswas, Pritam, Oliver, Jesse, Vasques, Kevin, Waleffe, Roger, Riach, Duncan, Olabiyi, Oluwatobi, Mahabaleshwarkar, Ameya Sunil, Kartal, Bilal, Gundecha, Pritam, Nguyen, Khanh, Milesi, Alexandre, Khvedchenia, Eugene, Zilberstein, Ran, Masad, Ofri, Bagrov, Natan, Assaf, Nave, Asida, Tomer, Afrimi, Daniel, Zuker, Amit, Haber, Netanel, Cheng, Zhiyu, Xin, Jingyu, Wu, Di, Spirin, Nik, Moosaei, Maryam, Ageev, Roman, Shah, Vanshil Atul, Wu, Yuting, Korzekwa, Daniel, Sreekumar, Unnikrishnan Kizhakkemadam, Jiang, Wanli, Subramanian, Padmavathy, Rico, Alejandra, Bhaskar, Sandip, Motiian, Saeid, Wu, Kedi, Surla, Annie, Chen, Chia-Chih, Wolff, Hayden, Feinberg, Matthew, Corpuz, Melissa, Wawrzos, Marek, Long, Eileen, Jhunjhunwala, Aastha, Hendricks, Paul, Memarian, Farzan, Hall, Benika, Wang, Xin-Yu, Mosallanezhad, David, Singhal, Soumye, Vega, Luis, Cheung, Katherine, Pawelec, Krzysztof, Evans, Michael, Luna, Katherine, Lou, Jie, Galinkin, Erick, Hazare, Akshay, Purandare, Kaustubh, Guan, Ann, Warno, Anna, Cui, Chen, Suhara, Yoshi, Likhite, Shibani, Mard, Seph, Price, Meredith, Sleiman, Laya, Kaji, Saori, Karpas, Udi, Briski, Kari, Conway, Joey, Lightstone, Michael, Kautz, Jan, Shoeybi, Mohammad, Patwary, Mostofa, Cohen, Jonathen, Kuchaiev, Oleksii, Tao, Andrew, Catanzaro, Bryan
We introduce Nemotron Nano V2 VL, the latest model of the Nemotron vision-language series designed for strong real-world document understanding, long video comprehension, and reasoning tasks. Nemotron Nano V2 VL delivers significant improvements over our previous model, Llama-3.1-Nemotron-Nano-VL-8B, across all vision and text domains through major enhancements in model architecture, datasets, and training recipes. Nemotron Nano V2 VL builds on Nemotron Nano V2, a hybrid Mamba-Transformer LLM, and innovative token reduction techniques to achieve higher inference throughput in long document and video scenarios. We are releasing model checkpoints in BF16, FP8, and FP4 formats and sharing large parts of our datasets, recipes and training code.
MazeMate: An LLM-Powered Chatbot to Support Computational Thinking in Gamified Programming Learning
Hou, Chenyu, Yu, Hua, Zhu, Gaoxia, Anas, John Derek, Liu, Jiao, Ong, Yew Soon
Computational Thinking (CT) is a foundational problem-solving skill, and gamified programming environments are a widely adopted approach to cultivating it. While large language models (LLMs) provide on-demand programming support, current applications rarely foster CT development. We present MazeMate, an LLM-powered chatbot embedded in a 3D Maze programming game, designed to deliver adaptive, context-sensitive scaffolds aligned with CT processes in maze solving and maze design. We report on the first classroom implementation with 247 undergraduates. Students rated MazeMate as moderately helpful, with higher perceived usefulness for maze solving than for maze design. Thematic analysis confirmed support for CT processes such as decomposition, abstraction, and algorithmic thinking, while also revealing limitations in supporting maze design, including mismatched suggestions and fabricated algorithmic solutions. These findings demonstrate the potential of LLM-based scaffolding to support CT and underscore directions for design refinement to enhance MazeMate usability in authentic classrooms.
Multi-Method Analysis of Mathematics Placement Assessments: Classical, Machine Learning, and Clustering Approaches
Allagan, Julian D., Singleton, Dasia A., Perry, Shanae N., Morgan, Gabrielle C., Morgan, Essence A.
This study evaluates a 40-item mathematics placement examination administered to 198 students using a multi-method framework combining Classical Test Theory, machine learning, and unsupervised clustering. Classical Test Theory analysis reveals that 55\% of items achieve excellent discrimination ($D \geq 0.40$) while 30\% demonstrate poor discrimination ($D < 0.20$) requiring replacement. Question 6 (Graph Interpretation) emerges as the examination's most powerful discriminator, achieving perfect discrimination ($D = 1.000$), highest ANOVA F-statistic ($F = 4609.1$), and maximum Random Forest feature importance (0.206), accounting for 20.6\% of predictive power. Machine learning algorithms demonstrate exceptional performance, with Random Forest and Gradient Boosting achieving 97.5\% and 96.0\% cross-validation accuracy. K-means clustering identifies a natural binary competency structure with a boundary at 42.5\%, diverging from the institutional threshold of 55\% and suggesting potential overclassification into remedial categories. The two-cluster solution exhibits exceptional stability (bootstrap ARI = 0.855) with perfect lower-cluster purity. Convergent evidence across methods supports specific refinements: replace poorly discriminating items, implement a two-stage assessment, and integrate Random Forest predictions with transparency mechanisms. These findings demonstrate that multi-method integration provides a robust empirical foundation for evidence-based mathematics placement optimization.
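The discrimination thresholds quoted above ($D \geq 0.40$ excellent, $D < 0.20$ poor) can be illustrated with a minimal sketch. The abstract does not say which variant of the index the study computes; the version below assumes the common upper-lower group form, D = p_upper - p_lower, with 27% groups. The function names, the "acceptable" label for the middle band, and the toy response matrix are illustrative, not taken from the paper.

```python
from typing import List


def discrimination_index(responses: List[List[int]], item: int,
                         fraction: float = 0.27) -> float:
    """Upper-lower discrimination index D for one item (assumed variant).

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    """
    # Rank students by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    # Size of each comparison group (at least one student).
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    # Proportion answering the item correctly in each group.
    p_upper = sum(row[item] for row in upper) / k
    p_lower = sum(row[item] for row in lower) / k
    return p_upper - p_lower


def classify(d: float) -> str:
    """Apply the thresholds quoted in the abstract; the middle label is assumed."""
    if d >= 0.40:
        return "excellent"
    if d < 0.20:
        return "poor"
    return "acceptable"


# Toy data: 6 students x 4 items (made up for illustration).
toy = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
for i in range(4):
    d = discrimination_index(toy, i)
    print(f"item {i}: D = {d:.2f} ({classify(d)})")
```

With six students the 27% rule yields groups of two, so an item that the top and bottom groups answer equally well gets D = 0 and would be flagged for replacement.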
Transforming Mentorship: An AI Powered Chatbot Approach to University Guidance
Rahman, Mashrur, Abedin, Mantaqa, Abir, Monowar Zamil, Ansari, Faizul Islam, Reza, Adib, Sadeque, Farig Yousuf, Farhan, Niloy
University students face immense challenges during their undergraduate lives and often lack the personalized, on-demand guidance that mentors cannot provide at scale. Digital tools exist, but there is a serious lack of customized coaching for newcomers. This paper presents an AI-powered chatbot that will serve as a mentor for the students of BRAC University. The main component is a data ingestion pipeline that efficiently processes and updates information from diverse sources, such as CSV files and university webpages. The chatbot retrieves information through a hybrid approach, combining BM25 lexical ranking with ChromaDB semantic retrieval, and uses a Large Language Model, LLaMA-3.3-70B, to generate conversational responses. The generated text was found to be semantically highly relevant, with a BERTScore of 0.831 and a METEOR score of 0.809. The data pipeline was also efficient, taking 106.82 seconds for updates, compared to 368.62 seconds for new data. This chatbot will be able to help students by responding to their queries, helping them get a better understanding of university life, and assisting them in planning better routines for their semester in the open-credit university. Because of the dynamic academic environment, the large number of students relative to faculty and staff, and complex university program policies and procedures, students face challenges throughout the four years of university education. Students at open-credit universities struggle with obtaining accurate policy information, selecting appropriate courses, scheduling classes, and managing limited time with mentors due to mentor shortages. Technology has given students many resources, but on-demand, personal help is still lacking. This is especially risky for first-year students, who sometimes struggle with the new environment and may need additional guidance. To fill this gap, we will provide a corpus-based chatbot that also serves as a student companion.
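The abstract does not state how the BM25 ranking and the semantic ranking are merged into one result list. One common option, shown here purely as an assumption, is reciprocal rank fusion (RRF), which needs only the two ranked lists of document IDs and not their raw scores. The function name, the constant k = 60, and the document IDs are illustrative and not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in;
    documents are then ordered by the sum of these contributions.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a lexical (BM25) and a semantic (vector) retriever.
lexical_ranking = ["advising", "fees", "calendar"]
semantic_ranking = ["calendar", "advising", "housing"]

fused = reciprocal_rank_fusion([lexical_ranking, semantic_ranking])
print(fused)
```

Documents ranked well by both retrievers rise to the top, while documents found by only one of them are still kept as lower-ranked candidates.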
Scaffolding Metacognition in Programming Education: Understanding Student-AI Interactions and Design Implications
Ma, Boxuan, Li, Huiyong, Li, Gen, Chen, Li, Tang, Cheng, Xie, Yinjie, Gu, Chenghao, Shimada, Atsushi, Konomi, Shin'ichi
Generative AI tools such as ChatGPT now provide novice programmers with unprecedented access to instant, personalized support. While this holds clear promise, their influence on students' metacognitive processes remains underexplored. Existing work has largely focused on correctness and usability, with limited attention to whether and how students' use of AI assistants supports or bypasses key metacognitive processes. This study addresses that gap by analyzing student-AI interactions through a metacognitive lens in university-level programming courses. We examined more than 10,000 dialogue logs collected over three years, complemented by surveys of students and educators. Our analysis focused on how prompts and responses aligned with metacognitive phases and strategies. Synthesizing these findings across data sources, we distill design considerations for AI-powered coding assistants that aim to support rather than supplant metacognitive engagement. Our findings provide guidance for developing educational AI tools that strengthen students' learning processes in programming education.
Learning from Online Videos at Inference Time for Computer-Use Agents
Liu, Yujian, Wang, Ze, Chen, Hao, Sun, Ximeng, Yu, Xiaodong, Wu, Jialian, Liu, Jiang, Barsoum, Emad, Liu, Zicheng, Chang, Shiyu
Computer-use agents can operate computers and automate laborious tasks, but despite recent rapid progress, they still lag behind human users, especially when tasks require domain-specific procedural knowledge about particular applications, platforms, and multi-step workflows. Humans can bridge this gap by watching video tutorials: we search, skim, and selectively imitate short segments that match our current subgoal. In this paper, we study how to enable computer-use agents to learn from online videos at inference time effectively. We propose a framework that retrieves and filters tutorial videos, converts them into structured demonstration trajectories, and dynamically selects trajectories as in-context guidance during execution. Particularly, using a VLM, we infer UI actions, segment videos into short subsequences of actions, and assign each subsequence a textual objective. At inference time, a two-stage selection mechanism dynamically chooses a single trajectory to add in context at each step, focusing the agent on the most helpful local guidance for its next decision. Experiments on two widely used benchmarks show that our framework consistently outperforms strong base agents and variants that use only textual tutorials or transcripts. Analyses highlight the importance of trajectory segmentation and selection, action filtering, and visual information, suggesting that abundant online videos can be systematically distilled into actionable guidance that improves computer-use agents at inference time. Our code is available at https://github.com/UCSB-NLP-Chang/video_demo.
Measuring Teaching with LLMs
Objective and scalable measurement of teaching quality is a persistent challenge in education. While Large Language Models (LLMs) offer potential, general-purpose models have struggled to reliably apply complex, authentic classroom observation instruments. This paper uses custom LLMs built on sentence-level embeddings, an architecture better suited for the long-form, interpretive nature of classroom transcripts than conventional subword tokenization. We systematically evaluate five different sentence embeddings under a data-efficient training regime designed to prevent overfitting. Our results demonstrate that these specialized models can achieve human-level and even super-human performance, with correlations with expert human ratings above 0.65 that surpass the average human-human rater correlation. Further, through analysis of annotation context windows, we find that more advanced models (those better aligned with human judgments) attribute a larger share of score variation to lesson-level features rather than isolated utterances, challenging the sufficiency of single-turn annotation paradigms. Finally, to assess external validity, we find that aggregate model scores align with teacher value-added measures, indicating they are capturing features relevant to student learning. However, this trend does not hold at the individual item level, suggesting that while the models learn useful signals, they have not yet achieved full generalization. This work establishes a viable and powerful new methodology for AI-driven instructional measurement, offering a path toward providing scalable, reliable, and valid feedback for educator development.
HACI: A Haptic-Audio Code Interface to Improve Educational Outcomes for Visually Impaired Introductory Programming Students
This thesis introduces the Haptic-Audio Code Interface (HACI), an educational tool designed to enhance programming education for visually impaired (VI) students by integrating haptic and audio feedback to compensate for the absence of visual cues. HACI consists of a non-resource-intensive web application supporting JavaScript program development, execution, and debugging, connected via a cable to an Arduino-powered glove with six integrated haptic motors to provide physical feedback to VI programmers. Motivated by the need to provide equitable educational opportunities in computer science, HACI aims to improve non-visual code navigation, comprehension, summarizing, editing, and debugging for students with visual impairments while minimizing cognitive load. This work details HACI's design principles, technical implementation, and a preliminary evaluation through a pilot study conducted with undergraduate Computer Science students. Findings indicate that HACI aids in the non-visual navigation and understanding of programming constructs. Challenges remain in refining feedback mechanisms to ensure consistency and reliability, and in supplementing the current functionality with a more feature-rich and customizable accessible learning experience that will allow visually impaired students to fully utilize interleaved haptic and audio feedback. The study underscores the transformative potential of haptic and audio feedback in educational practices for the visually impaired, setting a foundation for future research and development in accessible programming education. This thesis contributes to the field of accessible technology by demonstrating how tactile and auditory feedback can be effectively integrated into educational tools, thereby broadening accessibility in STEM education.
AI Song Contest - vote for your favourite
The AI Song Contest was founded with the aim of showcasing the potential of human-AI co-creativity in the songwriting process. Now in its sixth year, the competition will conclude on 16 November with a live show in Amsterdam. From all the entrants, the jury have selected their top ten songs. The live event will feature performances from the ten finalists, and you will be able to watch on YouTube. Listen to the songs and vote for your favourite.
AURA: Autonomous Upskilling with Retrieval-Augmented Agents
Zhu, Alvin, Tanaka, Yusuke, Goldberg, Andrew, Hong, Dennis
Designing reinforcement learning curricula for agile robots traditionally requires extensive manual tuning of reward functions, environment randomizations, and training configurations. We introduce AURA (Autonomous Upskilling with Retrieval-Augmented Agents), a schema-validated curriculum reinforcement learning (RL) framework that leverages Large Language Models (LLMs) as autonomous designers of multi-stage curricula. AURA transforms user prompts into YAML workflows that encode full reward functions, domain randomization strategies, and training configurations. All files are statically validated before any GPU time is used, ensuring efficient and reliable execution. A retrieval-augmented feedback loop allows specialized LLM agents to design, execute, and refine curriculum stages based on prior training results stored in a vector database, enabling continual improvement over time. Quantitative experiments show that AURA consistently outperforms LLM-guided baselines in generation success rate, humanoid locomotion, and manipulation tasks. Ablation studies highlight the importance of schema validation and retrieval for curriculum quality. AURA successfully trains end-to-end policies directly from user prompts and deploys them zero-shot on a custom humanoid robot in multiple environments - capabilities that did not exist previously with manually designed controllers. By abstracting the complexity of curriculum design, AURA enables scalable and adaptive policy learning pipelines that would be complex to construct by hand. Project page: https://aura-research.org/
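The idea of statically validating a generated workflow before any GPU time is spent can be sketched as follows. This is not AURA's actual schema or code: the field names, the `STAGE_SCHEMA` table, and `validate_stage` are hypothetical, and the stage is represented as an already-parsed dictionary (as a YAML loader would produce) so the sketch stays dependency-free.

```python
from typing import Any, Dict, List

# Hypothetical schema for one curriculum stage: field name -> required type.
STAGE_SCHEMA: Dict[str, type] = {
    "name": str,
    "reward_terms": dict,
    "domain_randomization": dict,
    "num_steps": int,
}


def validate_stage(stage: Dict[str, Any]) -> List[str]:
    """Return a list of schema errors; an empty list means the stage is valid."""
    errors: List[str] = []
    # Every required field must be present with the expected type.
    for field, expected in STAGE_SCHEMA.items():
        if field not in stage:
            errors.append(f"missing field: {field}")
        elif not isinstance(stage[field], expected):
            errors.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(stage[field]).__name__}"
            )
    # Reject fields the schema does not know about (e.g. hallucinated keys).
    for field in stage:
        if field not in STAGE_SCHEMA:
            errors.append(f"unknown field: {field}")
    # A simple value constraint on top of the type check.
    if isinstance(stage.get("num_steps"), int) and stage["num_steps"] <= 0:
        errors.append("num_steps must be positive")
    return errors


good_stage = {
    "name": "stand_up",
    "reward_terms": {"upright": 1.0},
    "domain_randomization": {"friction": [0.5, 1.5]},
    "num_steps": 1000,
}
bad_stage = {"name": "walk", "reward_terms": {}, "num_steps": "1000", "extra": 1}

print(validate_stage(good_stage))
print(validate_stage(bad_stage))
```

Only stages with an empty error list would be passed on to training; a non-empty list could be fed back to the generating LLM as a correction prompt, which is one plausible reading of the feedback loop described above.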