Instructional Material
PVBF: A Framework for Mitigating Parameter Variation Imbalance in Online Continual Learning
Tao, Zelin, Deng, Hao, Liu, Mingqing, Zhang, Lijun, Zhao, Shengjie
Online continual learning (OCL), which enables AI systems to adaptively learn from non-stationary data streams, is commonly achieved using experience replay (ER)-based methods that retain knowledge by replaying stored past samples during training. However, these methods face challenges of prediction bias, stemming from deviations in parameter update directions during task transitions. This paper identifies parameter variation imbalance as a critical factor contributing to prediction bias in ER-based OCL. Specifically, using the proposed parameter variation evaluation method, we highlight two types of imbalance: correlation-induced imbalance, where certain parameters are disproportionately updated across tasks, and layer-wise imbalance, where output layer parameters update faster than those in preceding layers. To mitigate the above imbalances, we propose the Parameter Variation Balancing Framework (PVBF), which incorporates: 1) a novel method to compute parameter correlations with previous tasks based on parameter variations, 2) an encourage-and-consolidate (E&C) method utilizing parameter correlations to perform gradient adjustments across all parameters during training, 3) a dual-layer copy weights with reinit (D-CWR) strategy to slowly update output layer parameters for frequently occurring sample categories. Experiments on short and long task sequences demonstrate that PVBF significantly reduces prediction bias and improves OCL performance, achieving up to 47% higher accuracy compared to existing ER-based methods.
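For readers unfamiliar with experience replay, the following is a minimal sketch of the replay memory that ER-based methods build on. It is not PVBF itself: the class name `ReservoirBuffer`, the reservoir-sampling fill rule, and the toy two-task stream are illustrative assumptions, not details taken from the abstract.

```python
import random


class ReservoirBuffer:
    """Fixed-size replay memory filled by reservoir sampling (hypothetical helper)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # number of stream samples observed so far
        self.rng = random.Random(seed)

    def add(self, sample):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(sample)
        else:
            # Each sample seen so far stays in memory with equal probability.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = sample

    def sample(self, k):
        # Draw up to k stored samples without replacement.
        return self.rng.sample(self.items, min(k, len(self.items)))


buffer = ReservoirBuffer(capacity=5)
for step in range(100):
    incoming = (f"x{step}", step // 50)  # (input, task id) from a toy two-task stream
    replayed = buffer.sample(2)          # stored past samples mixed into this update
    batch = [incoming] + replayed        # a model would be trained on this batch
    buffer.add(incoming)
```

In an actual OCL loop the model would take one gradient step on `batch` at each iteration; the imbalance the abstract describes arises in how those steps move the parameters.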
Knowledge Distillation with Training Wheels
Liu, Guanlin, Ramachandran, Anand, Gangwani, Tanmay, Fu, Yan, Sethy, Abhinav
In generative language modeling, knowledge distillation is used to train a smaller student model with the help of a larger teacher model, resulting in improved capabilities for the student model. In this paper, we formulate a more general framework for knowledge distillation where the student learns from the teacher during training, and also learns to ask for the teacher's help at test-time following rules specifying test-time restrictions. Towards this, we first formulate knowledge distillation as an entropy-regularized value optimization problem. Adopting Path Consistency Learning to solve this leads to a new knowledge distillation algorithm using on-policy and off-policy demonstrations. We extend this using constrained reinforcement learning to a framework that incorporates the use of the teacher model as a test-time reference, within constraints. In this situation, akin to a human learner, the model needs to learn not only the learning material, but also the relative difficulty of different sections to prioritize for seeking teacher help. We examine the efficacy of our method through experiments in translation and summarization tasks, observing trends in accuracy and teacher use, noting that our approach unlocks operating points not available to the popular Speculative Decoding approach.
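As background, the classic distillation objective that this line of work generalizes can be sketched as a KL divergence between temperature-softened teacher and student distributions over the next token. This is the standard textbook form, not the entropy-regularized value formulation proposed in the paper; the function names and the temperature value are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened next-token distributions."""
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [2.0, 0.5, -1.0]
matched = distillation_loss(teacher, teacher)             # student agrees with teacher
mismatched = distillation_loss([-1.0, 0.5, 2.0], teacher)  # student disagrees
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what drives the student toward the teacher during training.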
What is a Good Question? Utility Estimation with LLM-based Simulations
Lee, Dong-Ho, Cho, Hyundong, May, Jonathan, Pujara, Jay
Asking questions is a fundamental aspect of learning that facilitates deeper understanding. However, characterizing and crafting questions that effectively improve learning remains elusive. To address this gap, we propose QUEST (Question Utility Estimation with Simulated Tests). QUEST simulates a learning environment that enables the quantification of a question's utility based on its direct impact on improving learning outcomes. Furthermore, we can identify high-utility questions and use them to fine-tune question generation models with rejection sampling. We find that questions generated by models trained with rejection sampling based on question utility result in exam scores that are higher by at least 20% than those from specialized prompting grounded on educational objectives literature and models fine-tuned with indirect measures of question quality, such as saliency and expected information gain.
Improved Diffusion-based Generative Model with Better Adversarial Robustness
Wang, Zekun, Yi, Mingyang, Xue, Shuchen, Li, Zhenguo, Liu, Ming, Qin, Bing, Ma, Zhi-Ming
Diffusion Probabilistic Models (DPMs) have achieved significant success in generative tasks. However, their training and sampling processes suffer from the issue of distribution mismatch. During the denoising process, the input data distributions differ between the training and inference stages, potentially leading to inaccurate data generation. To obviate this, we analyze the training objective of DPMs and theoretically demonstrate that this mismatch can be alleviated through Distributionally Robust Optimization (DRO), which is equivalent to performing robustness-driven Adversarial Training (AT) on DPMs. Furthermore, for the recently proposed Consistency Model (CM), which distills the inference process of the DPM, we prove that its training objective also encounters the mismatch issue. Fortunately, this issue can be mitigated by AT as well. Based on these insights, we propose to conduct efficient AT on both DPM and CM. Finally, extensive empirical studies validate the effectiveness of AT in diffusion-based models. The code is available at https://github.com/kugwzk/AT_Diff.
AUKT: Adaptive Uncertainty-Guided Knowledge Transfer with Conformal Prediction
Liu, Rui, Gao, Peng, Shen, Yu, Lin, Ming, Tokekar, Pratap
Knowledge transfer between teacher and student models has proven effective across various machine learning applications. However, challenges arise when the teacher's predictions are noisy, or the data domain during student training shifts from the teacher's pretraining data. In such scenarios, blindly relying on the teacher's predictions can lead to suboptimal knowledge transfer. To address these challenges, we propose a novel and universal framework, Adaptive Uncertainty-guided Knowledge Transfer (AUKT), which leverages Conformal Prediction (CP) to dynamically adjust the student's reliance on the teacher's guidance based on the teacher's prediction uncertainty. CP is a distribution-free, model-agnostic approach that provides reliable prediction sets with statistical coverage guarantees and minimal computational overhead. This adaptive mechanism mitigates the risk of learning undesirable or incorrect knowledge. We validate the proposed framework across diverse applications, including image classification, imitation-guided reinforcement learning, and autonomous driving. Experimental results consistently demonstrate that our approach improves performance, robustness and transferability, offering a promising direction for enhanced knowledge transfer in real-world applications.
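To make the conformal prediction step concrete, here is a minimal sketch of split conformal prediction for classification, assuming the conformity score is one minus the teacher's probability for the true label. The function names, the toy calibration data, and the choice of score are illustrative assumptions; how AUKT turns the resulting sets into a reliance weight is not specified in the abstract and is not shown here.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Score threshold giving roughly (1 - alpha) marginal coverage."""
    scores = sorted(1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if rank > n:
        return 1.0  # too few calibration points: every label is included
    return scores[rank - 1]


def prediction_set(probs, threshold):
    """All labels whose score (1 - probability) is within the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]


# Toy calibration set: teacher probability assigned to the true label.
true_probs = [0.9, 0.8, 0.3, 0.7, 0.6, 0.95, 0.4, 0.5, 0.35]
cal_labels = [0, 1, 2, 0, 1, 2, 0, 1, 2]
cal_probs = []
for p, y in zip(true_probs, cal_labels):
    row = [(1.0 - p) / 2] * 3  # spread the remaining mass over the other labels
    row[y] = p
    cal_probs.append(row)

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.2)
confident = prediction_set([0.9, 0.05, 0.05], threshold)  # small set
ambiguous = prediction_set([0.45, 0.4, 0.15], threshold)  # larger set
```

A larger prediction set signals that the teacher is less certain on that input, which is the kind of signal an uncertainty-guided scheme could use to reduce the student's reliance on the teacher.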
Optimizing Retrieval-Augmented Generation of Medical Content for Spaced Repetition Learning
Kaczmarek, Jeremi I., Pokrywka, Jakub, Biedalak, Krzysztof, Kurzyp, Grzegorz, Grzybowski, Łukasz
Advances in Large Language Models have revolutionized medical education by enabling scalable and efficient learning solutions. This paper presents a pipeline that employs a Retrieval-Augmented Generation (RAG) system to generate comments for Poland's State Specialization Examination (PES) based on verified resources. The system integrates these generated comments and source documents with a spaced repetition learning algorithm to enhance knowledge retention while minimizing cognitive overload. By employing a refined retrieval system, a query rephraser, and an advanced reranker, our modified RAG solution prioritizes accuracy over efficiency. Rigorous evaluation by medical annotators, across a series of experiments presented in the paper, demonstrates improvements in key metrics such as document relevance, credibility, and logical coherence of generated content. This study highlights the potential of RAG systems to provide scalable, high-quality, and individualized educational resources for non-English-speaking users.
Intelligent Tutors Beyond K-12: An Observational Study of Adult Learner Engagement and Academic Impact
Gupta, Adit, MacLellan, Christopher
Intelligent tutors have proven to be effective in K-12 education, though their impact on adult learners -- especially as a supplementary resource -- remains underexplored. Understanding how adults voluntarily engage with educational technologies can inform the design of tools that support skill re-learning and enhancement. More critically, it helps determine whether tutoring systems, which are typically built for K-12 learners, can also support adult populations. This study examines the adoption, usage patterns, and effectiveness of a novel tutoring system, Apprentice Tutors, among adult learners at a state technical college. We analyze three types of data (user demographics, grades, and tutor interactions) to assess whether voluntary tutor usage translates into measurable learning gains. Our findings reveal key temporal patterns in tutor engagement and provide evidence of learning within tutors, as determined through skill improvement in knowledge components across tutors. We also found evidence that this learning transferred outside the tutor, as observed through higher course assessment scores following tutor usage. These results suggest that intelligent tutors are a viable tool for adult learners, warranting further research into their long-term impact on this population.
A Review of Causal Decision Making
Ge, Lin, Cai, Hengrui, Wan, Runzhe, Xu, Yang, Song, Rui
To make effective decisions, it is important to have a thorough understanding of the causal relationships among actions, environments, and outcomes. This review aims to surface three crucial aspects of decision-making through a causal lens: 1) the discovery of causal relationships through causal structure learning, 2) understanding the impacts of these relationships through causal effect learning, and 3) applying the knowledge gained from the first two aspects to support decision making via causal policy learning. Moreover, we identify challenges that hinder the broader utilization of causal decision-making and discuss recent advances in overcoming these challenges. Finally, we provide future research directions to address these challenges and to further enhance the implementation of causal decision-making in practice, and we illustrate real-world applications of the proposed causal decision-making framework. We aim to offer a comprehensive methodology and practical implementation framework by consolidating various methods in this area into a Python-based collection. URL: https://causaldm.github.io/Causal-Decision-Making.
LitLinker: Supporting the Ideation of Interdisciplinary Contexts with Large Language Models for Teaching Literature in Elementary Schools
Fan, Haoxiang, Zhou, Changshuang, Yu, Hao, Wu, Xueyang, Gu, Jiangyu, Peng, Zhenhui
Teaching literature under interdisciplinary contexts (e.g., science, art) that connect reading materials has become popular in elementary schools. However, constructing such contexts is challenging as it requires teachers to explore substantial amounts of interdisciplinary content and link it to the reading materials. In this paper, we develop LitLinker via an iterative design process involving 13 teachers to facilitate the ideation of interdisciplinary contexts for teaching literature. Powered by a large language model (LLM), LitLinker can recommend interdisciplinary topics and contextualize them with the literary elements (e.g., paragraphs, viewpoints) in the reading materials. A within-subjects study (N=16) shows that compared to an LLM chatbot, LitLinker can improve the integration depth of different subjects and reduce workload in this ideation task. Expert interviews (N=9) also demonstrate LitLinker's usefulness for supporting the ideation of interdisciplinary contexts for teaching literature. We conclude with concerns and design considerations for supporting interdisciplinary teaching with LLMs.
Rectifying Conformity Scores for Better Conditional Coverage
Plassier, Vincent, Fishkov, Alexander, Dheur, Victor, Guizani, Mohsen, Taieb, Souhaib Ben, Panov, Maxim, Moulines, Eric
We present a new method for generating confidence sets within the split conformal prediction framework. Our method performs a trainable transformation of any given conformity score to improve conditional coverage while ensuring exact marginal coverage. The transformation is based on an estimate of the conditional quantile of conformity scores. The resulting method is particularly beneficial for constructing adaptive confidence sets in multi-output problems where standard conformal quantile regression approaches have limited applicability. We develop a theoretical bound that captures the influence of the accuracy of the quantile estimate on the approximate conditional validity, unlike classical bounds for conformal prediction methods that only offer marginal coverage. We experimentally show that our method is highly adaptive to the local data structure and outperforms existing methods in terms of conditional coverage, improving the reliability of statistical inference in various applications.