CodeRabbit is one of the industry’s biggest consumers of reasoning models, burning through OpenAI’s o1 and o3-mini and Anthropic’s Claude 3.5 Sonnet (and now Claude 4) at scale. But this isn’t founder and CEO Harjot Gill’s first startup, and he’s not building yet another coding assistant. Instead, he’s tackling what he sees as software development’s most persistent bottleneck: the human code review process.
“The big bottleneck in shipping software is the code review process, especially as you grow your team,” Gill said in this On the Road episode of The New Stack Makers, recorded at Google Cloud Next.
“Pull requests begin to have a lot of ego clashes, a lot of discussion back and forth, until at some point you realize that it now takes a few weeks to ship even relatively minor changes.”
And here’s the kicker: that bottleneck has actually gotten worse as AI coding tools have proliferated.
In this episode of The New Stack Makers, Gill sat down with TNS Founder and Publisher Alex Williams to talk about breaking that bottleneck by automating code review with large language models. The pair discussed how CodeRabbit handles all the different ways that shipping code — whether human- or AI-written — can go wrong.
The Paradox of AI-Generated Code
While tools like GitHub Copilot and Cursor have dramatically accelerated code generation, they’ve created an unexpected side effect.
“Human developers are now becoming reviewers of AI code,” Gill said. “But a lot of times you don’t know what’s going on in that code — especially now with the rise of vibe coding,” he noted, referring to the practice of starting with descriptive, high-level text prompts and asking AI to generate substantial chunks of code.
This shift has fundamentally changed the developer workflow. Instead of carefully crafting each line, developers are increasingly in the position of reviewing and understanding code they didn’t write — code that might contain subtle bugs, security vulnerabilities or architectural inconsistencies that only surface during the review phase.
CodeRabbit’s approach differs from traditional static analysis tools that rely on rigid rule-based systems with high false-positive rates. “These rule-based systems do not understand the intent of the code change,” Gill said. “When the human comes in, they are looking at the architecture. They are looking at the implication — they have a lot of tribal knowledge.”
Building Context at Scale
The AI equivalent of tribal knowledge is context — massive amounts of it. CodeRabbit creates sandbox environments for each pull request, cloning entire repositories and giving AI agents access to navigate codebases just like humans do.
The system can generate CLI commands, search through files, analyze abstract syntax trees, and even pull additional context from external sources like Jira tickets or security vulnerability databases. It’s an agentic approach that goes far beyond simple pattern matching.
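CodeRabbit’s actual pipeline isn’t public, but the pattern Gill describes — clone the repository into a sandbox, then let an agent run commands and parse code to build context before the model renders a verdict — can be sketched roughly as follows. Everything here, including the `ask_llm` stub, is a hypothetical placeholder, not CodeRabbit’s API:

```python
import ast
import subprocess
import tempfile
from pathlib import Path


def clone_into_sandbox(repo_url: str) -> Path:
    """Clone the full repository into a throwaway sandbox directory."""
    sandbox = Path(tempfile.mkdtemp(prefix="pr-review-"))
    subprocess.run(["git", "clone", "--quiet", repo_url, str(sandbox)], check=True)
    return sandbox


def changed_files(sandbox: Path, base: str, head: str) -> list[str]:
    """Ask git which files the pull request actually touches."""
    diff = subprocess.run(
        ["git", "-C", str(sandbox), "diff", "--name-only", f"{base}...{head}"],
        check=True, capture_output=True, text=True,
    )
    return diff.stdout.splitlines()


def top_level_functions(path: Path) -> list[str]:
    """Parse the file's abstract syntax tree and list its top-level functions."""
    tree = ast.parse(path.read_text())
    return [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]


def ask_llm(prompt: str, context: list[str]) -> str:
    """Stand-in for the real model call; an actual agent would send the prompt
    and gathered context to an LLM, and likely loop to request more context."""
    return prompt + "\n" + "\n".join(context)


def review_pull_request(repo_url: str, base: str, head: str) -> str:
    """Gather repo-wide context for a PR, then hand it to the model."""
    sandbox = clone_into_sandbox(repo_url)
    context = []
    for name in changed_files(sandbox, base, head):
        path = sandbox / name
        if path.suffix == ".py" and path.exists():
            context.append(f"{name}: defines {top_level_functions(path)}")
    return ask_llm("Review this pull request for bugs and design issues.", context)
```

In a production system, the CLI searches, Jira lookups and vulnerability-database queries Gill mentions would simply be more tool functions available inside the same agent loop.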
But building reliable AI agents isn’t without challenges. “The errors compound,” Gill warned. “If you are hooking up multiple agents, your first level of agent [has], let’s say, 5% hallucination or errors. The next step is gonna see 10%, because each step you go deeper in the chain, the worse these errors become.”
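Gill’s numbers line up with simple independent-failure arithmetic: if each step in an agent chain errs 5% of the time, the chance that at least one error has crept in after n steps is 1 - (1 - 0.05)^n, which is just under 10% at two steps. A quick sketch:

```python
def compounded_error(per_step_error: float, steps: int) -> float:
    """Chance that at least one step in the chain has erred, assuming
    each step fails independently at the same rate."""
    return 1 - (1 - per_step_error) ** steps


for n in (1, 2, 5):
    print(f"{n} step(s): {compounded_error(0.05, n):.2%}")
# Prints roughly: 5.00%, 9.75%, 22.62% -- two chained steps already
# approach the 10% figure Gill cites.
```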
The Future of Code Review
Ultimately, Gill painted a picture of code review evolving from a purely human activity to a hybrid process where AI handles the heavy lifting of context gathering and initial analysis, while humans focus on higher-level architectural decisions and business logic validation.
The implications extend beyond just faster code reviews. As Gill noted, “almost every issue that you see can be traced back to some code change,” whether it’s a production deployment issue or a security vulnerability. By catching these issues earlier in the development cycle, automated code review systems could dramatically reduce the cost and complexity of software maintenance.
The full conversation dives deeper into the technical challenges of building reliable AI agents, the evolution of reasoning models, and how CodeRabbit’s approach to context gathering is pushing the boundaries of what’s possible with current LLM technology. For anyone interested in the intersection of AI and software development workflows, it’s a fascinating glimpse into how the tools we use to build software are themselves being rebuilt from the ground up.