The Double Descent Hypothesis Explains How Bigger Models Can Hurt Performance
