Not enough data to create a plot.
Try a different view from the menu above.

Plotting





bb1443cc31d7396bf73e7858cea114e1-AuthorFeedback.pdf

Neural Information Processing Systems

But in the field of reinforcement learning (RL),L2-norm is the most common choice due to its5 efficiency and effectiveness. Thus we adoptL2-norm in the paper to ensure consistency between the objective of6 Andersonacceleration(AA)andthelossofQ-valuefunction(critic).7 Minors.