Appendix to "Auxiliary Task Reweighting for Minimum-data Learning" Anonymous Author(s)

Neural Information Processing Systems

First we remove the dependency on the integral by taking its lower bound and upper bound. This is the case when KLα is large (see Figure 1a). This assumption holds as long as there is at least one task that is related to the main task (having a small KLα), which is reasonable because if all the tasks are unrelated, then reweighting is also meaningless. Specifically, we find the results insensitive to the choice of β. Only 1000 out of 65392 images are labeled.


On Efficiency in Hierarchical Reinforcement Learning

Neural Information Processing Systems

While this has been demonstrated empirically over time in a variety of tasks, theoretical results quantifying the benefits of such methods are still few and far between. In this paper, we discuss the kind of structure in a Markov decision process which gives rise to efficient HRL methods.