Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning