RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning