Caveat Prompter


In the wake of a major incident, you'll occasionally hear a leader admonish the engineering organization that we need to be more careful in order to prevent such incidents from happening again. Ultimately, these sorts of admonishments don't help improve reliability, because they miss an essential truth about the nature of work in organizations.

One of the big ideas from resilience engineering is the efficiency-thoroughness trade-off, also known as the ETTO principle, first articulated by Erik Hollnagel, one of the founders of the field. The idea is that there's a fundamental trade-off between how quickly we can complete tasks and how thorough we can be on each individual task. Let's consider software development with AI agents through the lens of the ETTO principle.

Coding agents like Claude Code and OpenAI Codex are capable of automatically generating significant amounts of code. Honestly, it's astonishing what these tools are capable of today. But like all LLMs, while they will always generate plausible-looking output, they do not always generate correct output. This means that a human needs to check an AI agent's work to ensure that the code is up to snuff: a human has to review the code generated by the agent.
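To make that concrete, here's a hypothetical sketch (my illustration, not output from any particular agent) of the kind of plausible-looking code an agent might produce: it runs, reads cleanly, and handles the obvious cases, but hides a subtle bug that a hurried reviewer could easily miss.

```python
# Hypothetical illustration: plausible-looking agent output with a subtle bug.

def merge_intervals(intervals):
    """Merge overlapping [start, end] intervals into a minimal set."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Bug: should be merged[-1][1] = max(merged[-1][1], end).
            # An interval fully contained in the previous one shrinks the result.
            merged[-1][1] = end
        else:
            merged.append([start, end])
    return merged

# Looks right on simple inputs...
print(merge_intervals([[1, 3], [2, 6], [8, 10]]))  # [[1, 6], [8, 10]] -- correct
# ...but fails when one interval contains another:
print(merge_intervals([[1, 10], [2, 3]]))          # [[1, 3]] -- should be [[1, 10]]
```

The point isn't this particular bug; it's that the code looks finished, so catching the problem depends entirely on the reviewer's thoroughness.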

Screenshot of asking Claude about coding mistakes. Note the permanent warning at the bottom.

As any human software engineer will tell you, reviewing code is hard. It takes effort to understand code that you didn’t write. And larger changes are harder to review, which means that the more work that the agent does, the more work the human in the loop has to do to verify it.

If the code compiles and runs and all the tests pass, how much time should the human spend reviewing it? The ETTO principle tells us there's a trade-off here: the incentives push us software engineers towards completing our development tasks more quickly, which is why we're adopting AI in the first place. After all, if it takes just as long to review the AI-generated code as it would have taken the human reviewer to write it from scratch, that defeats the purpose of automating the development task to begin with.

Maybe at first we're skeptical, and we spend more time reviewing the agent's code. But as we get better at working with the agents, and as the AI models themselves improve over time, we'll figure out where the trouble spots in AI-generated code tend to pop up, and we'll focus our code review effort accordingly. In essence, we're riding the ETTO trade-off curve: figuring out how much review effort we should put in, and where that effort should go.

Eventually, though, a problem with AI-generated code will slip through this human review process and will contribute to an incident. In the wake of that incident, the software engineers will be reminded that AI agents can make mistakes, and that they need to review the generated code carefully. But, as always, such reminders will do nothing to improve reliability, because while AI agents change the way that software developers work, they don't eliminate the efficiency-thoroughness trade-off.

Published October 12, 2025
