Theoretical Analysis of Learning with Reward-Modulated Spike-Timing-Dependent Plasticity