Normalization and effective learning rates in reinforcement learning

Clare Lyle