“I’m sorry, Dave, I can’t do that.” Years before we had computers that were even approaching claims of intelligence, science fiction was preparing us for the moment they might go wrong. Would they, like 2001’s HAL, decide that their mission would be a lot easier if their human weren’t on board the spaceship? Would they, like The Terminator’s SkyNet, decide humanity was the enemy?
As it turns out, the thing for which blockbusters didn’t prepare us was the thing we got: computers that just make stuff up. Barely a week goes by without someone sharing a tale of an artificial intelligence providing false information. Lawyers using AI to help them prepare arguments have been caught out because the computer cited non-existent cases.
US Health Secretary Robert F. Kennedy Jr produced a report on childhood disease that cited studies which do not exist. Google now gives, as its first answer to many searches, a response generated by AI, and that response is frequently incorrect.

It’s not hard to find people who think that AI is the wonder solution to every problem. Keir Starmer has followed Rishi Sunak in grabbing at it as something that will fix the NHS and raise our productivity, sparing him difficult decisions he would otherwise have to face. Media companies wondering if there’s an even cheaper way of producing content are similarly enthusiastic.
In fact, it has created new problems. Employers describe being overwhelmed by plausible job applications. Often it is only when the candidate arrives that they realise the person clearly didn’t write the letter themselves; sometimes they don’t realise even then. One person recruiting for a senior position asked shortlisted applicants to prepare a presentation.
Afterwards he put the request into ChatGPT and found himself looking at things very much like two of the presentations he’d watched that afternoon. Professional fraudsters now have a tool that can generate pictures, video and audio.
There are other areas for backlash: the amount of energy used to train AI models and generate answers, the massive breach of copyright involved in getting data for training, the threat it represents to jobs. None of those is likely to stop governments and companies adopting it. But if we’re going to use AI, we need to understand what it is, and what it’s not.
It is possible that, at some point in the near future, someone will develop an AI that can answer every question with high or total accuracy. But that is not today’s AI. It is not — and this is the crucial point — even what today’s AI is designed to be.

There are several rival “generative AI” systems, but the best known is OpenAI’s ChatGPT — used, the company says, by 500 million people every week. They all use a similar model: to simplify a little, this involves feeding a vast network of computer processors with millions upon millions of sentences, so that they can calculate, according to context, what the next word in a sentence is likely to be. (The need for a very large number of high-quality sentences explains why it has been necessary to use stolen copies of books.)
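For the curious, the principle can be sketched in a few lines of code. The toy below is only an illustration, not how ChatGPT is actually built: it counts which word follows which in a made-up three-sentence corpus, where the real systems use enormous neural networks trained on billions of sentences. But the job is the same: pick a plausible next word, with no notion of whether the result is true.

```python
# A toy "predict the next word" generator. Illustrative only: real systems
# like ChatGPT use vast neural networks, not a word-pair count over a few
# invented sentences.
import random
from collections import defaultdict

corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count which words tend to follow which in the corpus.
follows = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word].append(next_word)

def generate(start="the", length=8):
    """Repeatedly pick a plausible next word, with no notion of truth."""
    words = [start]
    for _ in range(length):
        candidates = follows.get(words[-1])
        if not candidates:
            break
        words.append(random.choice(candidates))
    return " ".join(words)

print(generate())  # e.g. "the cat sat on the rug . the dog"
```

Run it a few times and it produces grammatical-looking strings that mean nothing in particular, which is rather the point.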
Some versions can search the internet or use specialised tools for particular tasks. Some have been told to pay particular attention to the text they see in recent newspapers in an attempt to keep them up to date; maybe selling subscriptions to the Terminator is the future of media.
To say that all ChatGPT does is generate plausible sentences is not to denigrate it. This is an amazing achievement in computer science terms. But it also explains why it so often gets things wrong. Plausible sentences are not the same thing as accurate ones. This would be fine, if the people using ChatGPT understood what it was. The evidence is that a lot of them don’t.
It isn’t just the people caught out using AI to write their essays, their legal arguments and their plans to improve a G7 nation’s health. To listen to OpenAI saying their system “exceeds human PhD-level accuracy on a benchmark of physics, biology and chemistry problems”, you might think it was a pretty good place to go for answers. It’s hardly surprising that people are using it for financial planning, to diagnose an illness or to explain a legal contract.
But dig into OpenAI’s documentation and you find its systems are amazingly unreliable. Two versions it released in April scored 49 per cent and 20 per cent for accuracy on questions of fact. There are multiple versions of ChatGPT and multiple tests, but it’s common to find two-thirds or more of its answers are what the industry euphemistically calls “hallucinations”: plausible but simply incorrect sentences. You can use it, but you really shouldn’t trust it.
Some people see these errors and complain that AI is “lying” to them or “gaslighting” them. This, too, is to misunderstand it. AI is no more capable of lying to you than your car is. It works as designed: they asked it for plausible sentences; it fed some back.
It’s surprisingly difficult for us to get our heads around the idea that these plausible sentences don’t come from an intelligence. But AI researchers have known for decades that it would be difficult. They noticed it with one of the very first chatbots.

Back in the 1960s, MIT professor Joseph Weizenbaum developed a piece of conversation-simulating software he called ELIZA. The most popular variant saw ELIZA play the part of a psychotherapist.
“PLEASE TELL ME YOUR PROBLEM,” the computer would ask.
“I’m feeling depressed,” you might reply.
“I AM SORRY TO HEAR YOU ARE DEPRESSED”
“I hate my life. My work, my family, all of it.”
“WHY DO YOU SAY YOUR WORK”
It’s not the greatest — or best-punctuated — conversation in the world, but ELIZA did seem, at least for the first few questions, to be engaging with you.
You might reply: “My job doesn’t make me happy.”
To which ELIZA would say something like: “DOES THAT SUGGEST ANYTHING ELSE WHICH BELONGS TO YOU” — an answer that makes no sense. Which is hardly surprising, because ELIZA’s programming was simply to identify keywords in sentences and use them to generate replies.
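That mechanism is simple enough to sketch in a few lines of modern code. The patterns below are illustrative rather than Weizenbaum’s actual script, so the replies differ in detail from the ones above, but the principle is identical: spot a keyword, slot it into a canned response, and never understand a word of it.

```python
# An ELIZA-style keyword matcher. The rules here are made up for
# illustration; Weizenbaum's original used a much richer script.
import re

RULES = [
    (re.compile(r"\bi am (.+)", re.I), "I AM SORRY TO HEAR YOU ARE {0}"),
    (re.compile(r"\bi'?m (.+)", re.I), "I AM SORRY TO HEAR YOU ARE {0}"),
    (re.compile(r"\bmy (\w+)", re.I), "WHY DO YOU SAY YOUR {0}"),
    (re.compile(r"\byou (.+)", re.I), "WHY DO YOU THINK I {0}"),
]

def reply(sentence: str) -> str:
    # Take the first keyword pattern that matches and echo the captured
    # words back inside a canned frame.
    for pattern, template in RULES:
        match = pattern.search(sentence)
        if match:
            return template.format(match.group(1).upper().rstrip(".!?"))
    # No keyword found: fall back to a stock prompt.
    return "PLEASE GO ON"

print(reply("I'm feeling depressed"))
# prints: I AM SORRY TO HEAR YOU ARE FEELING DEPRESSED
print(reply("I hate my life. My work, my family, all of it."))
# prints: WHY DO YOU SAY YOUR LIFE
```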
ELIZA was pretty obviously not intelligent. The code wasn’t even designed to simulate intelligence. But that didn’t stop people treating ELIZA as intelligent.
Weizenbaum was shocked to notice that “extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people”. Someone who knew this was just a computer — who had watched him write the code — still engaged with ELIZA as though there was an intelligence there. It became known as the “ELIZA Effect”: we see human-like behaviour in computers.
ELIZA was built 60 years ago. The software was plausible for a few questions but quickly became repetitive or nonsensical. ChatGPT, on the other hand, is a much more complicated system, built specifically to sound plausible. But that plausibility can be dangerous.
The New York Times reported in June that it had been repeatedly contacted by people who had been told by ChatGPT to tell the paper about some “world-altering truth”, generally a conspiracy theory. It convinced one of them that he was living in a simulation and told him that if he jumped off a tall building, he would fly. Fortunately, he didn’t test this.
OpenAI hardly help. They say their product can “think”, “learn” and “reason”, though if it does these things, it doesn’t do them in the way humans generally recognise. To be charitable, they face the problem of describing its functions to lay audiences that know nothing of neural networks. To be less charitable, they have spent a lot of money on the system, and it doesn’t hurt for the public to believe it’s a wonder-product.
Which AI is in all sorts of ways. A decade ago, if I wanted a transcript of an interview, I had to either type it out myself, which would take hours, or pay someone else to do the job, an expensive and slow business. Now, like every other journalist I know, I use Otter, an AI tool trained on audio files, which delivers instant and pretty reliable results. It’s not perfect: the Sunday Times recently reported two senior members of the government drinking tequila, after Otter misheard “to Keir”. But it’s pretty good.
Then there’s AI-generated art, trained on lots and lots of images. It’s controversial amongst the artists who created those pictures and now lose work to it, but if you want a poster for your choir’s next concert, AI can generate you a beautiful one in seconds.
More than that, it can smarten up your PowerPoints, help fix spreadsheets, process data or write computer code. And there is a place for plausible sentences: lots of people struggle to write letters to their landlord or bank or lawyer. If we all knew how to express ourselves, Hallmark would be out of business.
Even with its hallucinations, AI is a useful tool if you understand what it is. The issue is that so many of its users don’t, and they are relying on it in fields they don’t understand themselves, where they can’t spot the mistakes. They are Professor Weizenbaum’s colleagues, seeking emotional support from code that neither knows nor understands them.
Isaac Asimov, whose I, Robot stories were some of the earliest to think seriously about issues around AI, didn’t predict the problem of it making things up. But he did come closer than anyone else to suggesting that the problems with intelligent computers would very often turn out to be caused by humans using them. To put it in a way that computer engineers understand, the problem is located between the screen and the chair.