967990de5b3eac7b87d49a13c6834978-AuthorFeedback.pdf
–Neural Information Processing Systems
Thank reviewers for the comments. Please find our responses below, with reference indices consistent with the paper . Q3-5: Meaning of the learned divergence? We agree that BC minimizes the policy KL divergence as what we noted in Sec. 4 (line 200). It is consistent with the literature, e.g., Sec. 2 in [Y u et al. arXiv:1909.09314].
Neural Information Processing Systems
Feb-9-2026, 10:33:09 GMT
- Technology: