4496bf24afe7fab6f046bf4923da8de6-AuthorFeedback.pdf

Feb-8-2026, 06:16:53 GMT–Neural Information Processing Systems

This is especially true because practical deployments of RL are bottlenecked by its poor sample efficiency. We didn't know about D4RL when writing the paper (it is a recent preprint), but we have now run the experiment on maze2d-umaze (Fig. a). Our model significantly outperforms the baselines and the ablations. Our experiment on D4RL also shows clear improvement over baselines and ablations (Fig. a). On WalkerParam, we agree with your analysis and will clarify in the paper that the performance improvement in WalkerParam comes from distillation.

experiment, unseen task, walkerparam

