9ac403da7947a183884c18a67d3aa8de-AuthorFeedback.pdf
–Neural Information Processing Systems
'mean'isthecorrectone.28 - We also looked at "softer" target policies, where similar conclusions can be drawn (see appendix). Q2, C1(a-e), C2, and C3: we will takeall these points into consideration, and work to maximize clarity in the final48 revision.