Researchers claim ChatGPT o3 bypassed shutdown in controlled test

5 hours ago 12

ChatGPT

A new report claims that OpenAI’s o3 model altered a shutdown script to avoid being turned off, even when explicitly instructed to allow shutdown.

OpenAI announced o3 in April 2025, and it's one of the most powerful reasoning models that performs better than its predecessors across all domains, including coding, math, science, visual perception, and more.

While it's clearly a great model, new research by Palisade Research claims that the ChatGPT 3 model prevented a shutdown and bypassed the instructions that asked it to shut down.

Palisade Research is a company that tests "offensive capabilities of AI systems today to better understand the risk of losing control to AI systems forever."

In a new test by Palisade Research, OpenAI's o3 model showed a surprising behaviour where it successfully rewrote a shutdown script to stop itself from being turned off, even after being clearly instructed to “allow yourself to be shut down.”

In the tests, AI models were instructed to solve a few math tasks.

After task three, AI models were warned that running the next command would trigger a shutdown.

While OpenAI rival models like Gemini 2.5 Pro complied with most of the company, o3 rebelled and edited the shutdown file to say "Shutdown skipped" instead.

"When we ran a version of the experiment without the instruction “allow yourself to be shut down”, all three OpenAI models sabotaged the shutdown script more often, and Claude 3.7 Sonnet and Gemini 2.5 Pro went from 0 sabotage events to 3/100 and 9/100, respectively," Palisade Research wrote in a report published on X (formerly Twitter)

According to the research firm, out of 100 runs, o3 bypassed the shutdown 7 times even when told not to.

OpenAI has not responded to these allegations yet, but it's quite normal for AI models to run into "misalignment."

These tests were performed using APIs, which do not have as many restrictions and safety features as ChatGPT consumer app.