Finetuning Deep Reinforcement Learning Policies with Evolutionary Strategies for Control of Underactuated Robots