Testing Sonnet/Opus vs. GPT-5 vs. Code Supernova on real coding tasks


Code Supernova, a stealth model, has launched for free through Kilo Code with a 200k context window and no rate limits.

With limited documentation available, we ran systematic tests to understand its capabilities.

Key finding: Supernova generates complete code 6-10x faster than GPT-5.

However, this speed comes with trade-offs in code structure and architecture, which points toward a different kind of development workflow. Additionally, it nearly matched Sonnet 4's UI design quality, one of Sonnet's strongest capabilities.

Here’s what the testing revealed about where it fits best.

We tested Supernova in Kilo Code against other frontier models using real-world challenges:

  • Frontend: Build a production-ready landing page

  • Backend: Create a SQLite job queue with concurrency handling

  • Measurement criteria: Speed, code quality, architecture, edge case handling

The results reveal a model built for a completely different purpose than GPT-5 or Opus 4.1.

This speed differential enables a specific use case: rapid iteration cycles.

In a few minutes with Supernova, you can:

  1. Generate an initial approach

  2. Hit an error, feed it back to the model

  3. Get a fixed version

  4. Realize you need a different architecture

  5. Generate a new approach

  6. Refine and polish

In the same timeframe, GPT-5 is still thinking about its first response.

Why this speed difference? Supernova is built as an execution model, not a planning model.

While GPT-5 spends time reasoning through architecture and edge cases, Supernova does exactly what you ask, nothing more, nothing less.

This makes it incredibly fast when you already know what you want.

The prompt: Build a Postgres hosting landing page with a hero section (“Professional Postgres Hosting, Simplified”), key features (auto-scaling, high availability, security), pricing tiers (Hobby/Professional/Enterprise), and trust elements, using deep navy/blue colors and a flat design.

Supernova’s result (17 seconds):

  • Fully functional React landing page

  • Visually polished with Tailwind CSS

  • Added “Most Popular” pricing badges without being asked

  • Professional layout matching Sonnet 4’s output

The result: One massive 400-line component. No modularity. Copy-pasted sections.

The code works, but it could be difficult to maintain.

Here’s how Supernova compares to Opus 4.1, Sonnet 4, and GPT-5 on the frontend task:

  • Architecture: GPT-5 > Opus 4.1 > Sonnet 4 > Supernova

  • Speed: Supernova > Sonnet 4 > Opus 4.1 > GPT-5

  • Visual Output: Supernova matched Sonnet 4’s design, while GPT-5 and Opus 4.1 delivered more polished results

The prompt: Implement a queue in TypeScript using better-sqlite3 for persistence, with optional delay timestamps for job scheduling.

Supernova delivered this in 20 seconds:

  • Worker pool implementation

  • Basic job processing

  • Retry counter

Supernova missed:

  • Transaction rollbacks

  • Job unlocking on failure

  • Cleanup mechanisms

  • Proper error propagation
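For scale, the core of the task is small. Here is a minimal Python sketch using the standard-library sqlite3 module (standing in for the TypeScript/better-sqlite3 version; table and column names are illustrative), written in the same non-atomic select-then-update style Supernova produced:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE jobs (
        id      INTEGER PRIMARY KEY,
        payload TEXT NOT NULL,
        run_at  REAL NOT NULL DEFAULT 0,   -- optional delay timestamp
        locked  INTEGER NOT NULL DEFAULT 0,
        retries INTEGER NOT NULL DEFAULT 0
    )
""")

def enqueue(payload: str, delay: float = 0.0) -> None:
    """Add a job, optionally scheduled `delay` seconds in the future."""
    conn.execute("INSERT INTO jobs (payload, run_at) VALUES (?, ?)",
                 (payload, time.time() + delay))
    conn.commit()

def claim_due_job():
    """Claim one job whose run_at has passed, marking it locked."""
    row = conn.execute(
        "SELECT id, payload FROM jobs "
        "WHERE locked = 0 AND run_at <= ? ORDER BY run_at LIMIT 1",
        (time.time(),)).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE jobs SET locked = 1 WHERE id = ?", (row[0],))
    conn.commit()
    return row

enqueue("send-email")           # due immediately
enqueue("cleanup", delay=60)    # scheduled a minute out
```

Note the flaw this style shares with Supernova's output: the SELECT and UPDATE are separate statements, so two workers can claim the same job before either locks it, and a crash between the two leaves the job permanently locked.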

GPT-5 spent over 3 minutes on this problem before generating code. However, its output was much more robust and production-ready.

The GPT-5 output used database transactions for atomic operations, implemented visibility timeouts to prevent lost jobs, and separated concerns with dedicated ack(), fail(), and release() methods.

Each job reservation was wrapped in a transaction that either completed fully or rolled back, preventing the race conditions present in Supernova’s approach.
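That pattern can be sketched briefly. The following is a hypothetical Python sqlite3 version (not GPT-5's actual output; the 30-second timeout and all names are illustrative) showing an atomic reservation with a visibility timeout plus ack/fail/release:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
conn.execute("""
    CREATE TABLE jobs (
        id         INTEGER PRIMARY KEY,
        payload    TEXT NOT NULL,
        visible_at REAL NOT NULL DEFAULT 0,  -- hidden until this time
        retries    INTEGER NOT NULL DEFAULT 0
    )
""")

VISIBILITY_TIMEOUT = 30.0  # seconds a claimed job stays hidden

def reserve():
    """Atomically claim one visible job; rolls back on any error."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE visible_at <= ? "
            "ORDER BY id LIMIT 1", (time.time(),)).fetchone()
        if row is None:
            conn.execute("COMMIT")
            return None
        # Hide the job; if the worker dies, it reappears after the timeout.
        conn.execute("UPDATE jobs SET visible_at = ? WHERE id = ?",
                     (time.time() + VISIBILITY_TIMEOUT, row[0]))
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise

def ack(job_id):      # success: remove the job for good
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))

def fail(job_id):     # failure: count a retry and make it visible again
    conn.execute("UPDATE jobs SET retries = retries + 1, visible_at = 0 "
                 "WHERE id = ?", (job_id,))

def release(job_id):  # give up the claim without counting a retry
    conn.execute("UPDATE jobs SET visible_at = 0 WHERE id = ?", (job_id,))

conn.execute("INSERT INTO jobs (payload) VALUES ('resize-image')")
```

Because `BEGIN IMMEDIATE` takes the write lock before the SELECT, two workers can never reserve the same job, and a crashed worker's job becomes visible again once its timeout lapses.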

The backend test revealed distinct trade-offs between the models across three critical dimensions.

  • Robustness: GPT-5 > Opus 4.1 > Sonnet 4 > Supernova

  • Speed: Supernova > Sonnet 4 > Opus 4.1 > GPT-5

  • Production Readiness: GPT-5 = Opus 4.1 > Sonnet 4 > Supernova

Testing reveals Supernova’s training cutoff at September 2024, matching GPT-5 but six months behind Sonnet 4 and Opus 4.1 (March 2025).

What Supernova doesn’t know:

  • Next.js 15 features

  • React 19 updates

  • Latest TypeScript syntax

  • Tailwind v4 classes

You’ll get working code using older patterns. Fine for prototypes, but requires updates for cutting-edge features.

After running about a dozen tests, a clear pattern emerged: Supernova is an execution model, not a planning model.

Where frontier models like GPT-5 excel at reasoning through problems, planning architectures, and considering edge cases, Supernova excels at rapid execution of clear instructions. It does exactly what you ask, without overthinking or over-engineering.

This makes Supernova perfect as a complement to planning models:

  • UI Component Generation
    Creating initial layouts, form designs, and dashboard mockups where visual output matters more than modularity.

  • API Testing and Integration
    Quickly generating test clients, webhook handlers, and API endpoint prototypes to verify integration approaches.

  • Proof of Concepts
    Building functional demos to validate technical feasibility before investing in proper architecture.

  • Quick Feature Additions
    Adding simple features like modals, tooltips, or notification systems to existing codebases.

  • Static Page Generation
    Creating landing pages, marketing sites, and documentation pages where maintenance isn’t a primary concern.

Supernova is (Still) Not Ideal For:

  • Production Systems: Testing shows limited error boundaries and monitoring hooks

  • Team Codebases: Single-file components make code review challenging

  • Safety-Critical Code: Missing defensive programming patterns in test outputs

  • Complex State Management: Simplified state handling compared to other models

Kilo Code’s model switching enables a workflow where you:

  1. Generate multiple UI prototypes with Supernova

  2. Select best approach based on visual output

  3. Refactor selected prototype with GPT-5 for production

  4. Result: Production-ready code in less time

Code Supernova is free in Kilo Code right now. Here’s the fastest path to trying it:

  1. Install Kilo Code (if you haven’t already)

  2. Switch to Supernova

    • Click model selector (under the prompt box)

    • Choose “Code Supernova”

  3. Try This Test:

    • Ask: “Build a landing page for ...”

    • Time how long it takes

    • Then try the same prompt with GPT-5

    The speed difference: 8-10x faster generation.

There are 3 key insights here:

Insight #1: Supernova delivers working code 6-10x faster than GPT-5, but without proper architecture or production safeguards. This speed comes from being an execution model rather than a planning model. It does exactly what you ask without spending time on reasoning or architecture decisions.

Insight #2: Supernova nearly matches Sonnet 4’s UI design quality. Sonnet has been the go-to model for frontend work. Now you get the same visual results at 3x the speed.

Insight #3: The execution-focused approach makes Supernova ideal for specific parts of the development workflow:

  1. Use GPT-5/Opus to plan architecture and identify edge cases

  2. Use Supernova to rapidly execute on those plans

  3. Switch back to GPT-5 for production refactoring

This isn’t about replacing your primary coding model. It’s about having fast execution when you already know what needs to be built. The 6-10x speed improvement makes it valuable for rapid iterations, prototyping, and implementing straightforward features where planning has already been done.

Code Supernova is currently available for free in Kilo Code with a 200k context window and no rate limits. No pricing has been announced for when the free period ends.
