Local policy search with Bayesian optimization

Sarah Müller

Neural Information Processing Systems 

Nevertheless, policy gradients for local search are often obtained from random perturbations rather than by systematically reasoning about, and actively choosing, informative samples. These random samples yield high-variance gradient estimates and are therefore suboptimal in terms of sample complexity.
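
To make the variance issue concrete, the following is a minimal sketch of a random-perturbation gradient estimator, the kind of baseline the abstract criticizes, not the method proposed in the paper. The quadratic `toy_return`, the perturbation scale `sigma = 0.1`, the 10-dimensional parameter vector, and all function names are assumptions chosen for illustration; the toy objective is used only because its true gradient is known exactly.

```python
import numpy as np


def perturbation_gradient(f, theta, sigma, num_samples, rng):
    """Estimate grad f(theta) from antithetic Gaussian perturbations."""
    grad = np.zeros_like(theta)
    for _ in range(num_samples):
        eps = rng.standard_normal(theta.shape)
        # Central difference of f along the random direction eps.
        delta = f(theta + sigma * eps) - f(theta - sigma * eps)
        grad += delta / (2.0 * sigma) * eps
    return grad / num_samples


def toy_return(theta):
    # Hypothetical stand-in for a policy's expected return.
    # Its true gradient is -theta, so estimation error can be measured.
    return -0.5 * float(np.dot(theta, theta))


def mean_error(num_samples, repeats=200, seed=0):
    """Average distance between the estimate and the true gradient."""
    rng = np.random.default_rng(seed)
    theta = np.ones(10)
    errors = []
    for _ in range(repeats):
        estimate = perturbation_gradient(toy_return, theta, 0.1, num_samples, rng)
        # The true gradient is -theta, so the error vector is estimate + theta.
        errors.append(np.linalg.norm(estimate + theta))
    return float(np.mean(errors))


if __name__ == "__main__":
    for n in (5, 50, 500):
        print(n, mean_error(n))
```

On this toy problem the estimator is unbiased, so the error is pure sampling noise, and it shrinks only at the usual Monte Carlo rate of roughly one over the square root of the number of perturbations. Each perturbation costs two evaluations of the objective, which is why random sampling is expensive when every evaluation is a rollout.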
