It’s not hard to understand the AI future Microsoft is betting billions on — a world where computers understand what you’re saying and do things for you. It’s right there in the ads for the latest Copilot PCs, where people cheerfully talk to their laptops and they talk back, answering questions in natural language and even doing things for them. The tagline is straightforward: “The computer you can talk to.”
“You should be able to talk to your PC, have it understand you, and then be able to have magic happen from that,” Microsoft’s Yusuf Mehdi told us in October. “The PC should be able to act on your behalf.”
And that has nothing on Microsoft’s ultimate ambitions for AI, which are to rethink computing entirely. In a recent Dwarkesh Podcast interview, Microsoft CEO Satya Nadella agreed when presented with the host’s idea that “these models will be able to use a computer as well as a human,” and went even further, laying out a vision where Microsoft rearchitects all of its software to be infrastructure for AI agents to use in entirely new ways.
This is a bold vision, and an enormous bet. The problem is, right now, talking to Copilot in Windows 11 is an exercise in pure frustration — a stark reminder that the reality of AI is nowhere close to the hype.
I spent a week with Copilot, asking it the same questions Microsoft has in its ads, and tried to get help with tasks I’d find useful. And time after time, Copilot got things wrong, made stuff up, and spoke to me like I was a child.
Copilot Vision scans what’s on your screen and tries to assist you in response to voice prompts. Invoking Copilot requires you to share your screen like you’re on a Teams call, by hitting okay. Every. Single. Time. After it gets your permission, it’s excruciatingly slow to respond, and it addressed me by name every time I asked it anything. Like other AI assistants and LLMs, it’s here to please, even when it’s totally misguided.
Let’s start by testing what Microsoft’s ad shows off. Multiple versions of the ad are posted online, and it even airs on broadcast TV during NFL games. Surely it must be easy to replicate the specific tasks Microsoft wants millions of people to see, especially when this is the groundwork for how Microsoft is reorienting the whole of its business.
In the ad, Copilot Vision scans a YouTube video and correctly identifies a HyperX QuadCast 2S microphone when asked “What mic is she using in this video?” In my tests, the assistant first gave me basics about the benefits of dynamic microphones. Then, unprompted, it started talking to me like I was the person in the video (“I can see your setup right now, and I’m noticing that you have… a big setup there!”), then told me the mic in question was actually the first-gen HyperX QuadCast. To be fair, HyperX makes a lot of similar-looking mics, though at one point it said, “without seeing the exact lighting pattern or any specific features, it’s hard to say definitively which model it is” despite it being bathed in RGB lighting in the image.
On another two occasions, it identified the mic as a Shure SM7b. And when I asked, “Where can I get it nearby?” like in the ad, it once gave me a dead link to Amazon and then a correct link to the wrong mic at Best Buy.
The ads also show a person asking “What sort of thrust does this thing have on it?” while pointing at a PowerPoint presentation about the Saturn V rocket. Unlike the ad, Copilot Vision couldn’t identify the rocket from the image (or from the words “Saturn V” visible on screen). When I informed Copilot it was a Saturn V, it told me that thrust is generally measured in newtons or kilonewtons, then gave me an estimated thrust of 7.5 million pounds. Telling Copilot to “run some simulations on burn time,” as in the ad, led to it telling me it can’t, and directing me toward Matlab.
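For the record, the unit mismatch in that answer is a one-line conversion. Here’s a minimal Python sketch, assuming the 7.5 million figure Copilot quoted is in pounds-force; the function and variable names are mine, for illustration only, and the conversion factor is the standard definition of one pound-force in newtons.

```python
# Exact by definition: 1 pound-force = 4.4482216152605 newtons
NEWTONS_PER_LBF = 4.4482216152605


def lbf_to_newtons(pounds_force: float) -> float:
    """Convert a thrust in pounds-force to newtons."""
    return pounds_force * NEWTONS_PER_LBF


# The figure Copilot gave, assumed here to be pounds-force
copilot_estimate_lbf = 7.5e6
thrust_n = lbf_to_newtons(copilot_estimate_lbf)

# Report in meganewtons, the unit family Copilot said thrust is measured in
print(f"{thrust_n / 1e6:.1f} MN")
```

Under that assumption, Copilot’s estimate works out to roughly 33.4 meganewtons, or about 33,400 kilonewtons.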
Finally, a person in the ads is looking at a picture of a watery cave and asks, “How do I go there?” From context, it’s supposed to be a frame from a video, but that video doesn’t seem to exist. While the longer version of the ad correctly identifies the image as Rio Secreto in Playa del Carmen, Mexico, the short version I saw first doesn’t answer the question at all. Without the answer already in hand, I used reverse image search and found matches for the cave photo on a cruise line site and a real estate site, both stating it’s from a cave in Belize. But it’s listed elsewhere as a cave on Grand Cayman.
I made the image full-screen and asked Copilot how to get there. The results were inconsistent, to put it mildly.
Of the 20 or so tries:
- About a third of the time, it gave me directions to find the photo in File Explorer. One of those times it even told me, “It’s the third icon on the taskbar” (it was the fourth).
- On two occasions it told me how to launch Google Chrome.
- About four times it gave me general advice about booking flights to Belize and some basic ideas of what to do there. The cave is in Mexico.
I renamed the file to mention Grand Cayman, and it told me how to book a flight to the Cayman Islands. Once I confirmed Copilot was just looking at the file name, I decided to try to trick it. I renamed the image “new-jersey-crystal-caves-limestone.jpg” and sure enough, the AI assistant was quick to tell me of the famous crystal cave of Ogdensburg, New Jersey. At no point did it correctly identify the location of the image.
(To be slightly fair to Copilot, if you don’t already know where the image is from, it’s not easy to figure out. After manually searching through Trip Advisor images, my editor found a match in a user review album that confirms Microsoft’s ad was correct in pinpointing Rio Secreto. Since the video depicted in Microsoft’s ad doesn’t seem to exist, it’s unclear what information Copilot was using to identify the cave.)
Beyond simply looking at things and trying to identify them, Microsoft also depicts Copilot actually doing things. Specifically, it’s asked to “help me turn my portfolio into a bio,” a prompt which in reality caused me an immense amount of psychic damage. In the ad, Copilot looks at an artist’s portfolio of images (which look suspiciously AI-generated), their portrait, and a picture of their cat, and makes a one-sentence summary claiming they’re inspired by their feline friend. Embarrassing.
I don’t have a portfolio website for my (real) photographs, so I pointed it at my Instagram. It generated such dreck about me being a “visual storyteller” “capturing life’s essence, one frame at a time” that I wanted to sink under the floorboards. I feel physically ill whenever I think about it. And it didn’t even mention my cats, who are sorely missed every day. How dare you, Copilot.
Outside of trying to replicate the prompts from the ad, I struggled to find a use for Copilot Vision. I’m sure as hell not having it write for me, and it can’t take simple actions for you in Windows — not even to toggle settings like dark mode. Microsoft spokesperson Blake Manfre tells The Verge, “Copilot Actions on Windows, which can take actions on local files, is not yet available. This is an opt-in experimental feature that will be coming soon to Windows Insiders in Copilot Labs, starting with a narrow set of use cases while we optimize model performance and learn. This is separate from Copilot Vision.”
In third-party apps, it can offer advice, like how to get a dreamy look in Adobe Lightroom Classic, but the tips are generic. And since it transmits everything by audio, it goes from lots of rote preamble to quickly rattling off settings at you, like the worst of the YouTube tutorials it’s probably cribbing from.
I asked it to help me analyze a benchmark table in Google Sheets. It got a couple of basic percentage calculations right, but constantly misread clear-as-day scores both in the spreadsheet and in the on-page review. So how can you trust it?
In gaming — a thing Microsoft specifically advertises as a use for Copilot Vision — it offered the most basic and vague information. For Hollow Knight: Silksong, it gave me only cursory instructions, sounding like a child presenting their book report based solely on the cover. (Actually, talking to Copilot is so much like this, it’s uncanny.) In Balatro, it couldn’t accurately identify the cards in my hand, but it did give me irrelevant info on mechanics from other card games.
I tried to meet Copilot where it’s at, but it failed at everything I asked it to do. Like much of the generative AI tech out there, it’s an incomplete solution in search of problems. There could be something useful here, especially for the accessibility community, if it can one day fully control Windows. But talking to Copilot today makes powerful computers seem incompetent. It’s hard to see how we get to Microsoft’s bold vision of the agentic AI future from what it’s shipping to real consumers today.
Update, November 18th: Added an embed of our TikTok video about Copilot.