Everyone Is Using A.I. for Everything. Is That Bad? – The New York Times:
ROOSE And then, of course, there’s the hallucination problem: These systems are not always factual, and they do get things wrong. But I confess that I am not as worried about hallucinations as a lot of people — and, in fact, I think they are basically a skill issue that can be overcome by spending more time with the models. Especially if you use A.I. for work, I think part of your job is developing an intuition about where these tools are useful and not treating them as infallible. If you’re the first lawyer who cites a nonexistent case because of ChatGPT, that’s on ChatGPT. If you’re the 100th, that’s on you.
NEWTON Right. I mentioned that one way I use large language models is for fact-checking. I’ll write a column and put it into an L.L.M., and I’ll ask it to check it for spelling, grammatical and factual errors. Sometimes a chatbot will tell me, “You keep describing ‘President Trump,’ but as of my knowledge cutoff, Joe Biden is the president.” But then it will also find an actual factual error I missed.
But how do you know the “factual error” it found is an actual factual error, not the kind of hallucination that Kevin Roose says he’s not worried about? Newton, a little later in the conversation:
How many times as a journalist have I been reading a 200-page court ruling, and I want to know where in this ruling does the judge mention this particular piece of evidence? L.L.M.s are really good at that. They will find the thing, but then you go verify it with your own eyes.
First of all, I’m thinking: Hasn’t command-F already solved that problem? Does Newton not know that trick? Presumably he does, unless he’s reading the entire “200-page court ruling” to “verify with [his] own eyes” what the chatbot told him. So:
Casey Newton’s old workflow: Command-F to search a text for references to a particular piece of evidence.
Casey Newton’s new AI-enhanced workflow: Ask a chatbot whether a text refers to a particular piece of evidence. Then use command-F to see if what the chatbot told him is actually true.
Now that’s what I call progress!
The NYT puts that conversation on its front page next to this story:
[Screenshot of the adjacent front-page story touting machine learning’s “ability to read and summarize text.”]
But as almost everyone who has ever used a chatbot knows, the bots’ “ability to read and summarize text” is horribly flawed. Every bot I have used absolutely sucks at summarizing text: they all get some things right and some things catastrophically wrong, in response to almost any prompt. So until the bots get better at this, machine learning “will change the stories we tell about the past” by making shit up.
“Brain donor” is a cheap insult, but I feel like we’re seeing mind donation in real time here. Does Newton really fact-check the instrument he uses to check his facts? This is the same guy who also notes: “Dario Amodei, the chief executive of Anthropic, said recently that he believes chatbots now hallucinate less than humans do.” Newton doesn’t say he believes Amodei — “I would like to see the data,” he says, as if there could be genuine “data” on that question — but to treat a salesman’s sales pitch for his own product as a point to be considered on an empirical question is a really bad sign.
I won’t be reading anything Newton writes from this point on — because why would I? He doesn’t even think he has anything to offer — but I bet (a) in the next few months he’ll get really badly burned by trusting chatbot hallucinations and (b) that won’t change the way he uses them. He’s donated his mind and I doubt he’ll ask for it to be returned.