 machine teaching


Neural Information Processing Systems

Large language models have recently shown a remarkable ability for few-shot learning, including patterns of algorithmic nature. However, it is still an open question to determine what kind of patterns these models can capture and how many examples they need in their prompts. We frame this question as a teaching problem with strong priors, and study whether language models can identify simple algorithmic concepts from small witness sets. In particular, we explore how several GPT architectures, program induction systems and humans perform in terms of the complexity of the concept and the number of additional examples, and how much their behaviour differs. This first joint analysis of language models and machine teaching can address key questions for artificial intelligence and machine learning, such as whether some strong priors, and Occam's razor in particular, can be distilled from data, making learning from a few examples possible.



Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners

Neural Information Processing Systems

In real-world applications of education, an effective teacher adaptively chooses the next example to teach based on the learner's current state. However, most existing work in algorithmic machine teaching focuses on the batch setting, where adaptivity plays no role. In this paper, we study the case of teaching consistent, version space learners in an interactive setting. At any time step, the teacher provides an example, the learner performs an update, and the teacher observes the learner's new state.
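
The interaction loop described above (teacher provides an example, learner updates, teacher observes the new state) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the hypothesis class (integer thresholds), the greedy selection rule, and the names `label`, `update`, and `teach` are all hypothetical.

```python
from typing import List, Sequence, Tuple


def label(threshold: int, x: int) -> int:
    """Hypothetical hypothesis class: x is labelled 1 iff x >= threshold."""
    return 1 if x >= threshold else 0


def update(version_space: List[int], x: int, y: int) -> List[int]:
    """Consistent learner: keep only hypotheses that agree with (x, y)."""
    return [t for t in version_space if label(t, x) == y]


def teach(
    target: int, thresholds: Sequence[int], instances: Sequence[int]
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Greedy interactive teacher (an assumed rule, not the paper's algorithm).

    Assumes the target is in `thresholds` and that any two thresholds
    disagree on at least one instance, so the loop terminates.
    """
    version_space = list(thresholds)
    transcript: List[Tuple[int, int]] = []
    while len(version_space) > 1:
        # The teacher observes the learner's current state (its version space)
        # and picks the instance on which most remaining hypotheses are wrong.
        x = max(
            instances,
            key=lambda i: sum(
                label(t, i) != label(target, i) for t in version_space
            ),
        )
        y = label(target, x)
        version_space = update(version_space, x, y)  # learner's update step
        transcript.append((x, y))
    return version_space, transcript


final_space, transcript = teach(target=3, thresholds=range(6), instances=range(5))
print(final_space, transcript)
```

In this toy setting the teacher needs two examples, one on each side of the target threshold, to reduce the version space to the target alone.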


Machine Teaching of Active Sequential Learners

Neural Information Processing Systems

On the other hand, for goal-oriented tasks, humans create mental models of the environment for planning their actions to achieve their goals [1,2]. In AI systems, recent research has shown that users form mental models of the AI's state and behaviour [3,4].


Teaching Inverse Reinforcement Learners via Features and Demonstrations

Neural Information Processing Systems

We introduce a natural quantity, the teaching risk, which measures the potential suboptimality of policies that look optimal to the learner in this setting. We show that bounds on the teaching risk guarantee that the learner is able to find a near-optimal policy using standard algorithms based on inverse reinforcement learning. Based on these findings, we suggest a teaching scheme in which the expert can decrease the teaching risk by updating the learner's worldview, and thus ultimately enable her to find a near-optimal policy.


Iterative Teacher-Aware Learning

Neural Information Processing Systems

In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency. The teacher adjusts her teaching method for different students, and the student, after getting familiar with the teacher's instruction mechanism, can infer the teacher's intention to learn faster.