When random search is not enough: Sample-Efficient and Noise-Robust Blackbox Optimization of RL Policies