Training on the Test Task Confounds Evaluation and Emergence