An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search

Neural Information Processing Systems 

DRL and ES have opposite properties: DRL has good sample efficiency but poor stability, while ES is the reverse. Recently, there have been attempts to combine these algorithms, but these methods rely entirely on a synchronous update scheme, which is not ideal for maximizing the benefits of parallelism in ES.
