October 7, 2025
Traditional Tool Calling Agents:
Input → LLM → loop(Tool Call → Execute Tool → Result → LLM) → output
Claude Code Mode Agent:
Input → loop(Claude Codes w/ access to your tools) → Execute Code → output
Why it's better
Each time tool output is passed back to the LLM and a new response is generated, there is some chance of a mistake. As the number of iterations n your agent performs increases, Code Mode looks better and better compared to traditional iterative tool-calling agents.
LLMs are better at generating code files, then verifying and running the code they generated, than they are at producing output through repeated tool calls.
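A quick back-of-the-envelope illustration of why per-iteration errors matter (my numbers, not from the source): if each LLM round-trip is mistake-free with probability p, an agent that needs n serial tool-call iterations only finishes cleanly with probability p^n, while a code-generation pass pays the per-generation risk roughly once no matter how many tool calls the script makes.

```python
def chance_of_clean_run(p_per_iteration: float, n_iterations: int) -> float:
    """Probability that every one of n serial LLM iterations is mistake-free,
    assuming independent per-iteration success probability p."""
    return p_per_iteration ** n_iterations

# Even a 98%-reliable step degrades quickly as n grows:
print(chance_of_clean_run(0.98, 1))   # 0.98
print(chance_of_clean_run(0.98, 20))  # ~0.67
print(chance_of_clean_run(0.98, 50))  # ~0.36
```

The exact probabilities are made up, but the compounding shape is the point: serial tool-calling reliability decays exponentially in n.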
How It Works
Let's say you have an Architect AI that is supposed to verify that blueprints created by the company all satisfy the required documentation before a product is shipped.
```
Your Application
    agent = Agent('claude-sonnet-4-5-20250929')

    @agent.tool
    def search_docs(query: str) -> list[database_rows]: ...

    @agent.tool
    def analyze_blueprints(file_location: str) -> dict: ...

    @agent.tool
    def report_blueprint(pdf_location: str) -> None: ...

    result = agent.codemode("verify structural integrity")

        │ 1. Extract tools from agent
        ▼
Claude Codemode
    2. Generate agentRunner.py with tool definitions:

        def search_docs(query: str) -> list[database_rows]:
            """Search documentation, returns database rows."""
            # ... implementation ...

        def analyze_blueprints(file_location: str) -> dict:
            """Analyze blueprints for issues."""
            # ... implementation ...

        def report_blueprint(pdf_location: str) -> None:
            """Report a blueprint for failing standards."""
            # ... implementation ...

        def main(params: str = "verify structural integrity"):
            # TODO: Implement task using above tools
            pass

        │ 3. Spawn Claude Code
        ▼
Claude Code
    Instructions:
    "Implement the main() function to accomplish the task.
     Use the provided tools by writing Python code."

    Claude reads agentRunner.py and writes the implementation:

        def find_database_files_on_disk(database_row):
            # Claude wrote this helper function!
            return database_row['file'].download().path

        def main():
            # Search docs - returns database rows
            database_rows = search_docs("structural integrity")

            # Analyze each blueprint and report failures
            analyses = []
            for row in database_rows:
                file_path = find_database_files_on_disk(row)
                result = analyze_blueprints(file_path)
                analyses.append(result)
                if not result.passes:
                    report_blueprint(file_path)

            return {"analyses": analyses}

        │ 4. Execute agentRunner.py
        ▼
    python agentRunner.py

        │ 5. Return result
        ▼
CodeModeResult
    {
        "output": {...},
        "success": true,
        "execution_log": "..."
    }
```

More Specifically Why It's Better
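Steps 1 and 2 of that pipeline (extract tools, generate a runner file with an empty main()) can be sketched in a few lines. This is an illustrative mock, not the library's actual internals; generate_runner and the stub bodies here are hypothetical names I chose for the sketch.

```python
import inspect

def generate_runner(tools, task: str) -> str:
    """Render an agentRunner.py source string from plain Python tool functions.
    (Hypothetical helper for illustration only.)"""
    parts = []
    for tool in tools:
        sig = inspect.signature(tool)          # e.g. (query: str) -> list
        doc = inspect.getdoc(tool) or ""
        parts.append(
            f"def {tool.__name__}{sig}:\n"
            f'    """{doc}"""\n'
            f"    ...  # real implementation injected at execution time\n"
        )
    # main() is left as a TODO for Claude Code to fill in.
    parts.append(
        f"def main(params: str = {task!r}):\n"
        f"    # TODO: Claude Code implements the task using the tools above\n"
        f"    pass\n"
    )
    return "\n".join(parts)

def search_docs(query: str) -> list:
    """Search documentation, returns database rows."""

runner_source = generate_runner([search_docs], "verify structural integrity")
print(runner_source)
```

The generated source contains the tool signatures and docstrings plus a stub main(), which is exactly what gets handed to Claude Code in step 3.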
This problem has a lot of the elements an agent excels at: some ambiguity, pieces that need to be integrated together, and output that needs clean formatting.
As the number of functions the agent needs to call grows very large, the Code Mode Agent's advantage keeps increasing. The best LLM agents today can call tools in parallel, but it's difficult for them to call more than 10 tools at once. They can call tools serially, but after 20 iterations they lose track of where they are, even with a todo list.
With a Code Mode Agent, the generated code can call an essentially unlimited number of the functions provided by the original agent definition, without waiting through a ridiculous number of agent iterations imposed by those serial and parallel limits.
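To make the contrast concrete, here is an illustrative sketch (with a hypothetical check_part tool) of what the generated script does with a 500-item fan-out: it is just a loop, with no extra LLM round-trips, no 10-way parallel tool-call ceiling, and no place to lose in a todo list.

```python
def check_part(part_id: int) -> bool:
    """Stand-in for a real agent tool; here, even-numbered parts pass."""
    return part_id % 2 == 0

def main():
    # 500 "tool calls" execute in a single pass of the generated script.
    failures = [pid for pid in range(500) if not check_part(pid)]
    return {"checked": 500, "failed": len(failures)}

result = main()
print(result)  # {'checked': 500, 'failed': 250}
```

A traditional agent would need dozens of serial iterations (or many batches of parallel tool calls) to cover the same 500 invocations.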
My open source library that creates code mode agents, claude_codemode
https://github.com/ryx2/claude-codemode
```python
from pydantic_ai import Agent
from claude_codemode import codemode

# Create an agent with tools
agent = Agent('claude-sonnet-4-5-20250929')

@agent.tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Weather in {city}: Sunny, 72°F"

@agent.tool
def calculate_temp_diff(temp1: str, temp2: str) -> str:
    """Calculate temperature difference."""
    import re
    t1 = int(re.search(r'(\d+)', temp1).group(1))
    t2 = int(re.search(r'(\d+)', temp2).group(1))
    return f"{abs(t1 - t2)}°F"

# Use codemode instead of run
result = codemode(
    agent,
    "Compare weather between San Francisco and New York"
)
print(result.output)
```

Inspirations
- Cloudflare's blog post introducing the code mode concept
- Theo's t3 chat video for making me aware of this approach
- Early MCP implementation by jx-codes