Towards Mixed Optimization for Reinforcement Learning with Program Synthesis