Human vs. Muppet: A Conservative Estimate of Human Performance on the GLUE Benchmark