Why I think strong general AI is coming soon - LessWrong

#artificialintelligence 

I think there is little time left before someone builds AGI (median 2030). Once upon a time, I didn't think this. This post attempts to walk through some of the observations and insights that collapsed my estimates. A single invocation of GPT-3, or any large transformer, cannot run any algorithm internally that does not run in constant time complexity, because the model itself runs in constant time. It's a very large constant, but it is still a constant. They don't have any learnable memory about their internal state from previous invocations. They just have the input stream. Despite all their capability, transformers are fundamentally limited.[1] This is part of the reason why asking GPT-3 to do integer division on large numbers in one shot doesn't work. GPT-3 is big enough to memorize a number of results, so adding small numbers isn't too hard even without fine tuning. And GPT-3 is big enough to encode a finite number of unrolled steps for more complex algorithms, so in principle, fine tuning it on a bunch of arithmetic could get you better performance on somewhat more complex tasks. But no matter how much retraining you do, so long as you keep GPT-3's architecture the same, you will be able to find some arithmetic problem it can't do in one step because the numbers involved would require too many internal steps. So, with that kind of limitation, obviously transformers fail to do basic tasks like checking whether a set of parentheses are balanced... Oh wait, GPT-3 was just writing dialogue for a character that didn't know how to balance parentheses, and then wrote the human's side of the dialogue correcting that character's error. And it writes stories with a little assistance with long-run consistency. And it can generate functioning code. Some of this is already productized. This is an architecture that is provably incapable of internally dividing large integers, and it can handle a variety of difficult tasks that come uncomfortably close to human intuition. Could the kind of intelligence we care about be algorithmically simpler than integer division? This can't be literally true, if we want to include integer division as something a generally intelligent agent can do. But it sure looks like tractable constant time token predictors already capture a bunch of what we often call intelligence, even when those same systems can't divide! I'm raising my eyebrows right now to emphasize it!

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found