Variance Reduction for Evolution Strategies via Structured Control Variates

Tang, Yunhao, Choromanski, Krzysztof, Kucukelbir, Alp

May-29-2019–arXiv.org Machine Learning

Evolution Strategies (ES) are a powerful class of blackbox optimization techniques that recently became a competitive alternative to state-of-the-art policy gradient (PG) algorithms for reinforcement learning (RL). We propose a new method for improving accuracy of the ES algorithms, that as opposed to recent approaches utilizing only Monte Carlo structure of the gradient estimator, takes advantage of the underlying MDP structure to reduce the variance. We observe that the gradient estimator of the ES objective can be alternatively computed using reparametrization and PG estimators, which leads to new control variate techniques for gradient estimation in ES optimization. We provide theoretical insights and show through extensive experiments that this RL-specific variance reduction approach outperforms general purpose variance reduction methods.

artificial intelligence, estimator, machine learning, (15 more...)

arXiv.org Machine Learning

May-29-2019

arXiv.org PDF

Add feedback

Country:
- North America > United States (0.28)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (1.00)
  - Machine Learning
    - Neural Networks (1.00)
    - Statistical Learning (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found