Training AI will take longer than you think


Are we on our way to superintelligence wiping out humans by 2027? Or are we in a bubble of hype? My take:

  • Neural networks, the software underlying AI, seem adequate to do anything humans can do.

They can do many things that previously only brains could do, including having realistic-seeming conversations, writing poetry, translating languages, understanding images and drawing pictures. Also, while they do many tasks badly, we haven’t yet found one that they intrinsically cannot do at all. Every time someone’s suggested one (counting the number of R’s in strawberry, say), the next iteration has solved it. There have been speed bumps but no brick walls.

  • However, the software is useless without training, and current methods of training are very unlikely to get us to superintelligence.

You can see this because even though AI knows way more than you, it still gets ridiculously stuck on trivial tasks. (Here’s me trying to draw a diagram with ChatGPT. At some point I start typing in all caps. Is it wrong to verbally abuse a robot?) If it were bad at everything, it might simply improve with time. But if it’s brilliant at some things and terrible at others, then something is being done wrong. It has already been trained enough to make it more knowledgeable than any polymath ever! More of that same training won’t help it match the reading comprehension of an eight-year-old.

99% of current training is next-word prediction: you feed it text and it has to guess the next word. That simple task is amazingly good at teaching the AI the concepts encoded in text, and it’s enough to make large language models talk convincingly, which always seemed like something only an intelligent being could do. There are petabytes of data to train on. But as we’ve seen, it hasn’t been enough to bridge the large gap between today’s AI and humans, and there is currently a consensus in the industry that much more reinforcement learning will be needed.
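To make the objective concrete, here is a minimal sketch of next-word (next-token) prediction as a toy PyTorch training step. The tiny vocabulary, toy "model" and example sentence are all made up for illustration and bear no relation to how any production system is built; only the objective is the point.

```python
# Minimal sketch of next-token prediction (illustrative only).
import torch
import torch.nn as nn

vocab = ["<pad>", "2", "+", "=", "4", "the", "cat", "sat"]
token_to_id = {tok: i for i, tok in enumerate(vocab)}

# Toy "language model": an embedding plus a linear layer that scores every
# vocabulary item as the possible next token. Real models are vastly larger,
# but the training signal is the same.
model = nn.Sequential(nn.Embedding(len(vocab), 16), nn.Linear(16, len(vocab)))

# Training pair: given each token, predict the one that actually follows it.
text = ["2", "+", "2", "=", "4"]
inputs = torch.tensor([token_to_id[t] for t in text[:-1]])
targets = torch.tensor([token_to_id[t] for t in text[1:]])

logits = model(inputs)                               # (seq_len, vocab_size) scores
loss = nn.functional.cross_entropy(logits, targets)  # "make the real next word more likely"
loss.backward()
print(f"next-token loss: {loss.item():.3f}")
```

That single signal, repeated over petabytes of text, is essentially all the model gets.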

Reinforcement learning means performing a task and rewarding the neural network for success. So whereas with next-token prediction, the network gets rewarded for completing “2 + 2 = ?” with “4” because many people on the internet have written down “2 + 2 = 4”, reinforcement learning rewards “4” because that is actually the correct answer.
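A toy sketch of that difference, in my own made-up framing (no real training pipeline looks like this): the next-word signal counts what the corpus happens to say, while the reinforcement signal comes from a checker that verifies the answer.

```python
# Toy illustration of where the reward signal comes from in each regime.
def next_word_reward(completion: str, corpus: list[str]) -> float:
    """Next-word prediction: "4" is rewarded because people wrote it down."""
    matches = sum(1 for doc in corpus if doc.endswith(completion))
    return matches / len(corpus)

def rl_reward(completion: str, correct_answer: str) -> float:
    """Reinforcement learning: "4" is rewarded because a checker verifies it."""
    return 1.0 if completion.strip() == correct_answer else 0.0

corpus = ["2 + 2 = 4", "two plus two is four", "the cat sat on the mat"]
print(next_word_reward("4", corpus))  # ~0.33: high only where the corpus agrees
print(rl_reward("4", "4"))            # 1.0: high because the answer is actually correct
```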

The difference kicks in when the correct answer is not something that has been written down on the internet before — like “can you remove the arrowheads from this diagram?” The hacks that let next-word-prediction machines answer questions include essentially asking the machine to roleplay a useful person. If you use an AI to code, for example, the system may start off by telling it “Act like a senior developer”. Given that the AI has never ever been a senior developer — it’s never met with a client, for instance — this hack is amazingly successful. But it has limits. It’s like getting your three-year-old daughter to become a powerful wizard by giving her a wizard costume, and asking her to talk like a wizard.
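For concreteness, this is roughly what the roleplay hack looks like in practice: a persona instruction prepended as the system message. The snippet below is a generic sketch using the OpenAI Python client; the model name is a placeholder and nothing here is specific to any particular coding tool.

```python
# Sketch of the "roleplay" hack: tell the next-word predictor who to pretend to be.
from openai import OpenAI  # assumes the openai Python package is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        # The hack itself: the model has never been a senior developer,
        # but asking it to act like one changes its answers.
        {"role": "system", "content": "Act like a senior developer."},
        {"role": "user", "content": "Review this function for bugs: def add(a, b): return a - b"},
    ],
)
print(response.choices[0].message.content)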

Current systems do do some reinforcement learning, but as I said, it’s 1% or less. That is because there is much less data available. The internet has tons of text; tasks with right and wrong answers are much rarer and more expensive to create.

So far this is conventional wisdom. Here is the kicker.

  • RL requires a lot of task-specific data.

The progress of AI under the next-word-prediction regime has been misleading, because the structure of training has been: 1) you dump a ton of text into the model to train on; 2) it becomes quite good at many different tasks. That’s because the internet and other available text sources are so big that they contain information about many different domains. But the structure of actual tasks is not like that. Information about plumbing and information about actuarial science can both be encoded as text; training to become a good plumber and training to become an actuary take substantially different forms.

In particular, Michael Polanyi distinguished two kinds of knowledge: explicit knowledge, the kind that can be written down and transmitted as text, or even formalized as axioms; and tacit knowledge, which is not formalized and can only be revealed through practice. Now tacit knowledge is arguably part of even the most formal and abstract of disciplines: mathematicians, for example, have “tricks” and heuristics as well as their collection of proven results. Of course, for skilled physical work tacit knowledge plays a much larger role, and the same holds for interpersonal skills like negotiation, sales or management. Probably all jobs depend on both explicit and tacit knowledge, and the interplay between them can be complex.

It’s not that tacit knowledge can never be learned from text — if you read an internet forum, you’ll develop an intuitive, inarticulate sense of its culture, for example. But much of it is probably not captured this way. More importantly, we don’t know much about how tacit knowledge carries over between contexts. If AI learns the tacit skills that make a good lawyer, will that help it with the tacit skills that make a good doctor? What about a town planner in Leeds, versus a town planner in Bristol? Or a trader in two different hedge funds with different approaches? We know that real humans do not easily transfer their skills between domains — they spend years specializing — so we should probably expect the same for AI.


Notice that this can be true even if we have superhuman AI in terms of its basic cognitive abilities. Even a very smart human has to spend lots of time learning new skills when he changes jobs. The same might be true of an AI smarter than the smartest human. Raw intelligence, if we can even define that beyond the human level, does not guarantee super-fast learning of job-specific skills. We should maybe separate “cognitive AGI” (AI that can reason and learn as well as a human) from “economic AGI” (AI that can do every job as well as a human).

So, you should picture the future progress of AI not as a set of breakthroughs to higher plateaus of general capability, but as a slow grind where different domains of expertise are conquered one by one.

Here are some issues this picture raises.

The rest is for paid subscribers. Subscribing helps me to keep writing and producing ideas. A paid subscription costs just £3.50/month (about $5). Yearly subscribers get a great big 40% discount, plus a free copy of my book.
