Clues for Which I Search and Choose
Before we leave these model-free chronicles behind, let me turn to the converse of the Linearization Principle. We have seen that random search works well on simple linear problems and appears better than some RL methods like policy gradient. Does random search break down as we move to harder problems? Let's apply random search to problems that are of interest to the RL community. The deep RL community has been spending a lot of time and energy on a suite of benchmarks, maintained by OpenAI and based on the MuJoCo simulator.
Mar-23-2018, 07:13:20 GMT