AITopics | Education

Training automated agents to complete complex tasks in interactive environments is challenging: reinforcement learning requires careful hand-engineering of reward functions, imitation learning requires specialized infrastructure and access to a human expert, and learning from intermediate forms of supervision (like binary preferences) is time-consuming and extracts little information from each human intervention. Can we overcome these challenges by building agents that learn from rich, interactive feedback instead? We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher. We begin by formalizing a class of human-in-the-loop decision making problems in which multiple forms of teacher-provided advice are available to a learner. We then describe a simple learning algorithm for these problems that first learns to interpret advice, then learns from advice to complete tasks even in the absence of human supervision. In puzzle-solving, navigation, and locomotion domains, we show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms and often less than imitation learning.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Industry: Education > Educational Setting > Online (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

376c6b9ff3bedbbea56751a84fffc10c-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 11:38:15 GMT

artificial intelligence, machine learning, student, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)

Add feedback

Does Knowledge Distillation Really Work? Samuel Stanton NYU Pavel Izmailov NYU Polina Kirichenko NYU Alexander A. Alemi Google Research Andrew Gordon Wilson NYU

Neural Information Processing SystemsApr-25-2026, 11:38:11 GMT

Knowledge distillation is a popular technique for training a small student network to emulate a larger teacher model, such as an ensemble of networks. We show that while knowledge distillation can improve student generalization, it does not typically work as it is commonly understood: there often remains a surprisingly large discrepancy between the predictive distributions of the teacher and the student, even in cases when the student has the capacity to perfectly match the teacher. We identify difficulties in optimization as a key reason for why the student is unable to match the teacher. We also show how the details of the dataset used for distillation play a role in how closely the student matches the teacher -- and that more closely matching the teacher paradoxically does not always lead to better student generalization.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Education (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Detecting and Adapting to Irregular Distribution Shifts in Bayesian Online Learning

Neural Information Processing SystemsApr-25-2026, 11:19:19 GMT

We consider the problem of online learning in the presence of distribution shifts that occur at an unknown rate and of unknown intensity. We derive a new Bayesian online inference approach to simultaneously infer these distribution shifts and adapt the model to the detected changes by integrating ideas from change point detection, switching dynamical systems, and Bayesian online learning. Using a binary'change variable,' we construct an informative prior such that-if a change is detected-the model partially erases the information of past model updates by tempering to facilitate adaptation to the new data distribution. Furthermore, the approach uses beam search to track multiple change-point hypotheses and selects the most probable one in hindsight. Our proposed method is model-agnostic, applicable in both supervised and unsupervised learning settings, suitable for an environment of concept drifts or covariate drifts, and yields improvements over state-of-the-art Bayesian online learning approaches.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre: Research Report (0.46)

Industry:

Education > Educational Setting > Online (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
(2 more...)

Add feedback

348a38cd25abeab0e440f37510e9b1fa-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 10:47:38 GMT

artificial intelligence, machine learning, statistic, (15 more...)

Neural Information Processing Systems

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Neural Pseudo-Label Optimism for the Bank Loan Problem

Neural Information Processing SystemsApr-25-2026, 10:29:10 GMT

We study a class of classification problems best exemplified by the bank loan problem, where a lender decides whether or not to issue a loan. The lender only observes whether a customer will repay a loan if the loan is issued to begin with, and thus modeled decisions affect what data is available to the lender for future decisions. As a result, it is possible for the lender's algorithm to "get stuck" with a self-fulfilling model. This model never corrects its false negatives, since it never sees the true label for rejected data, thus accumulating infinite regret. In the case of linear models, this issue can be addressed by adding optimism directly into the model predictions. However, there are few methods that extend to the function approximation case using Deep Neural Networks.

artificial intelligence, dataset, machine learning, (14 more...)

Neural Information Processing Systems

Genre: Research Report (0.68)

Industry:

Information Technology (0.93)
Banking & Finance > Loans (0.85)
Education > Educational Setting > Online (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.66)

Add feedback