The Double Descent Hypothesis Explains How Bigger Models Can Hurt Performance