b6fa3ed9624c184bd73e435123bd576a-Supplemental-Conference.pdf

Neural Information Processing Systems 

Understanding how CompILE performs over a mix of expert types, and how to enable more complex adaptation to a specific student's needs at the

This requires knowledge of how individual skills serve the ultimate task's goal

For convenience, we provide a glossary of all mathematical notation used in our framework.

Term    Meaning
Ξ       Set of skill labels corresponding to an Expert e's trajectory for scenario ξ
E       Expertise vector for a given student

For the rightmost character ("na" in Balinese), students learn to draw a smaller

Black dots represent skill boundaries identified by CompILE.

To address the challenges of extracting "human teachable" skills discussed above, one may consider

Therefore, in this work we attempted to incorporate preliminary notions of "human teachability"

Although we report results for both half-trained and reverse difficulty synthetic students after fine-tuning for 100 epochs, one natural question is the effect of training time.

Figure 10: Reward starts to plateau over training iterations for the "reversing difficulty" synthetic student. Reported values are average reward over 100 random rollouts.

As described in Sec. 5 of the main paper, our P

For the purpose of simplifying our user study, we make the following modifications:

1. IRB-approved study (Protocol No. 49406 reviewed by Stanford University).
