submission, we have tested MOPO on a non-MuJoCo environment: an HIV treatment simulator slightly modified

Neural Information Processing Systems 

We thank all the reviewers for the constructive feedback.

"fairly limited in terms of applicability... the ability to extend this work to more general settings?"

The task simulates sequential decision making in HIV treatment. We show results in Table 1, where MOPO outperforms BEAR and nearly matches the buffer max score.

Table 1. Columns: Buffer Max, Buffer Mean, SAC (online), BEAR, MOPO. 15986.2
