Reconciling λ-Returns with Experience Replay

Brett Daley, Christopher Amato

Neural Information Processing Systems 

A unique benefit to this approach is that each transition's TD error can be
