A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning