The Limits of Deep Learning


GPT-3, the latest state of the art in deep learning, achieved impressive results on a range of language tasks without task-specific training. The main difference between this model and its predecessor was sheer scale: GPT-3 was trained on hundreds of billions of words -- nearly the whole Internet -- yielding a wildly compute-heavy, 175-billion-parameter model. Yet OpenAI's authors caution that we can't scale models forever: "A more fundamental limitation of the general approach described in this paper -- scaling up any LM-like model, whether autoregressive or bidirectional -- is that it may eventually run into (or could already be running into) the limits of the pretraining objective." This is the law of diminishing returns in action.
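To get a feel for what "175 billion parameters" means in practice, here is a back-of-envelope sketch of the memory needed just to hold the weights (a rough estimate, assuming standard 4-byte and 2-byte floating-point formats; it ignores optimizer state, activations, and gradients, which multiply the footprint during training):

```python
# Rough memory footprint of storing 175 billion parameters.
N_PARAMS = 175e9  # GPT-3's reported parameter count

bytes_per_param = {"fp32": 4, "fp16": 2}

for dtype, nbytes in bytes_per_param.items():
    gigabytes = N_PARAMS * nbytes / 1e9
    print(f"{dtype}: {gigabytes:.0f} GB just for the weights")
# fp32: 700 GB just for the weights
# fp16: 350 GB just for the weights
```

Even in half precision, the weights alone far exceed the memory of any single accelerator, which is why inference and training must be sharded across many devices.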
