A Additional Benchmark Information
A.1 Offline

Neural Information Processing Systems 

Figure 5: Graphical representation of the normalized performance of the best trained policy on D4RL, averaged over 4 random seeds.

Figure 15: Graphical representation of the normalized performance of the last trained policy on D4RL after online tuning, averaged over 4 random seeds.

Our codebase is released under Apache License 2.0. For most of the algorithms and datasets, we use default hyperparameters if available. Decision Transformer (DT) training is split into epochs, each corresponding to a pass over the dataset.
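As an illustration of the "normalized performance ... averaged over 4 random seeds" in the captions above, the following is a minimal sketch that assumes the standard D4RL convention, in which a raw return is rescaled so that 0 corresponds to a random policy and 100 to an expert policy. The function names, the reference returns, and the per-seed returns are hypothetical placeholders, not values or code from our benchmark.

```python
from statistics import mean, stdev


def normalized_score(raw_return, random_return, expert_return):
    """D4RL-style normalization: 0 ~ random policy, 100 ~ expert policy."""
    return 100.0 * (raw_return - random_return) / (expert_return - random_return)


def aggregate_over_seeds(raw_returns, random_return, expert_return):
    """Normalize each seed's return, then report the mean and sample std."""
    scores = [
        normalized_score(r, random_return, expert_return) for r in raw_returns
    ]
    return mean(scores), stdev(scores)


# Hypothetical raw returns from 4 random seeds (illustrative only).
raw_returns = [50.0, 60.0, 70.0, 80.0]

# Hypothetical reference returns; real ones are specific to each dataset.
avg, spread = aggregate_over_seeds(
    raw_returns, random_return=0.0, expert_return=100.0
)
print(f"normalized score: {avg:.1f} +/- {spread:.1f}")
```

Whether the best or the last trained policy is evaluated only changes which raw returns are passed in; the normalization and the averaging over seeds stay the same.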
