Anthropic’s latest AI model spent 30 hours running by itself to code a chat app akin to Slack or Teams. It spat out about 11,000 lines of code, according to Anthropic, and it only stopped running when it had completed the task.
The model, Claude Sonnet 4.5, was announced today, and its ability to operate autonomously for 30 hours straight is a huge jump forward. Before, the company’s Opus 4 model made headlines in May for its ability to operate for seven hours.
It’s all a significant step in Anthropic’s battle to corner the market on both AI agents and AI coding. The company called Claude Sonnet 4.5 “the best model in the world for real-world agents, coding, and computer use” and said it “leads the market at using computers,” referencing the Computer Use feature Anthropic debuted nearly a year ago. The new model is particularly adept in fields like cybersecurity, financial services, and research, according to Anthropic. One of its beta-testers, Canva, said the new model helped with “complex, long-context tasks—from engineering in our codebase to in-product features and research.”
Anthropic, OpenAI, Google, and other companies have been continuously releasing incremental updates and features that allow their technology to act as an assistant both for consumers (researching topics, scheduling meet-ups, and looking up flights) and for enterprise and developer use (creating slide decks, helping with coding tasks, and analyzing spreadsheets). The battle for attention and reliance heats up nearly every month, if not every week. Days ago, OpenAI announced Pulse, its newest ChatGPT feature designed to be part of users’ morning routines and research topics relevant to their days.
Anthropic also said the new model would be paired with other updates to help developers code their own AI agents.
“We’re combining the launch of the model with access to virtual machines, memory, context management, and multi-agent support,” the company wrote in a release. “This essentially packages the same building blocks that power Claude Code - enabling developers to build their own cutting-edge agents.”
Dianne Penn, a head of product management at Anthropic, told The Verge in an interview that the model’s improvements in its computer use capabilities surprised even her. Claude Sonnet 4.5 is more than three times as skilled at navigating a browser and using a computer as Anthropic’s tech from last October. Penn said the team had received feedback from early-access customers (“the GitHubs and Cursors of the world”) and spent the past month working intensively on the model.
Scott White, product lead for Claude.ai, told The Verge that the new model operates at “chief-of-staff level” and can find availability across multiple people’s calendars and schedule a meeting, look at a data dashboard and pull together insights, write status updates based on one-on-one meetings with his direct reports, and more.
Neither White nor Penn had yet tried vibe-coding with the new model when The Verge spoke to them. But Penn said she uses Claude Sonnet 4.5 for hiring potential new team members at Anthropic.
“It’s been actually really helpful to have a continuous running prompt that I use of, ‘Do a deep web search, come up with like these parameters for profiles to source for certain types of roles on my team,’” Penn said. “That’s been really, really helpful. And I’ve seen the Sonnet 4.5 just do even better than in the past, on the quality and the depth of the searches and actually generating a spreadsheet with LinkedIn profiles so then I can email them.”