Recently, my programming workflow has felt more like sketching out a system on a drafting board with an intelligent pen than mucking around with individual units of code. I grab the right pen, that is, the right LLM, depending on what I'm sketching, whether it calls for broad strokes or fine detailing, and draft out the platform.
I believe this new abstraction in software engineering will move engineers up a layer to become architects and managers of coding agents. New engineering jobs will flourish in this shifting paradigm, since engineers must still retain control of the "pens" they sketch with, and this new baseline will naturally support ever more complexity on top of it.
Over the past two and a half years, I've been building a cross-platform marketplace with a team of LLMs to disrupt a specific industry. I've learned a lot from working day to day with these models, each of which has a unique skillset and technical capabilities that I orchestrate together to build relatively cleanly and quickly.
They are my interns and junior colleagues. They are overconfident in their abilities and need careful management. I've learned to wield them like any other tool.
Some handle large context windows better (ingesting vast amounts of information), while others are stronger at solving the actual problem. None are skilled at system architecture or project management, so my day-to-day job is the careful orchestration of sliced-up, narrowly defined tasks for each LLM, optimized to minimize the time to a working solution and keep us on schedule.
I was first inspired to quit my comfy, albeit stressful, tech job as a principal engineer and team lead when ChatGPT (then powered by GPT-3.5) was released in November 2022. This tipping point in generative AI finally made it possible to build a platform my co-founder and I had been planning for several years.
We chose to bootstrap with our own savings rather than raise outside funding. As much as I wanted to hire developers, it simply wasn't affordable while we were trying to keep our burn rate low.
After much trial and error, and much head-bashing against rate limits, I found a cadence with GPT-3.5, and even more so with GPT-4 and beyond. Eventually, I incorporated more models, more types of "pens," into my workflow, such as Gemini and Qwen.
I've tried AI-embedded IDEs like Cursor, but none feel trustworthy beyond autocomplete. Past a certain level of code complexity and depth, these agents can't be trusted to run hands-off.
I assign different pieces of code and requirements to LLMs of varying “skill levels,” based on their likelihood of producing a correct answer in a single pass, which maximizes time savings.
I use a combination of models depending on context window size, level of difficulty, and so on, since each LLM is better at certain tasks. This is equivalent to assigning specific areas of work to engineers based on their individual strengths.
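The routing idea above can be sketched as a small dispatch function. This is a minimal illustration of the decision logic, not my actual tooling; the model labels, thresholds, and difficulty tiers are all hypothetical placeholders.

```python
def route_task(context_tokens: int, difficulty: str) -> str:
    """Pick a model tier for a task based on how much context it needs
    and how hard it is. All names and thresholds are illustrative."""
    if context_tokens > 200_000:
        # Only a large-window model can take the whole context in one pass.
        return "large-context-model"
    if difficulty == "hard":
        # Hard problems go to the strongest reasoner, even if slower or pricier.
        return "frontier-model"
    # Routine, well-scoped edits go to a fast, cheap model for single-pass answers.
    return "fast-model"
```

In practice the "routing" happens in my head, but the principle is the same: maximize the odds of a correct answer on the first pass while spending frontier-model time only where it pays off.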
For nasty bugs that require a frontier model but also a large context window, I first distill vast amounts of data through Gemini 2.5 Pro's 1-million-token window into an overview, then feed that condensed context to the smarter model. Other times, letting the LLM auto-iterate, adding telemetry whose output feeds back into the model, resolves the issue fastest.
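The distill-then-solve pattern is simple enough to capture in two functions. This is a hedged sketch of the shape of the pipeline: `summarize` and `reason` stand in for calls to a large-context model and a frontier model respectively, and are assumptions, not real API signatures.

```python
from typing import Callable, List

def distill(chunks: List[str], summarize: Callable[[str], str]) -> str:
    """Compress many context chunks (logs, files, traces) into one overview
    via a large-context summarizer, e.g. a 1M-token-window model."""
    return summarize("\n\n".join(chunks))

def solve(overview: str, problem: str, reason: Callable[[str], str]) -> str:
    """Hand the compressed overview plus the bug description to a stronger
    reasoning model that could not have fit the raw context."""
    return reason(f"Context overview:\n{overview}\n\nProblem:\n{problem}")
```

The point of the two-stage split is that the expensive reasoner never sees the raw data, only a summary sized to its smaller window.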
During assembly, I copy and paste from each LLM's chat window and perform final spot checks. I've developed an intuition for spotting flawed or overconfident LLM logic, memory entropy, and other LLM bugs before integration.
The complexity of building across iOS, Android, web, backend, and DevOps means it takes too long for me to context-switch into each deeply specialized domain. Therefore, if I can help it, I try to draw with my pens above the baseline complexity and only zoom down into the code below if I hit a snag.
I don’t believe in the idea of vibe coding if you’re trying to build something novel or technically challenging. It’s simply not possible to develop a cohesive system without being actively at the helm, mindfully sketching, and steering the production of platform architecture.
In all, this has helped me, as a single engineer and architect, to build a massive, feature-rich platform in a fraction of the time it would have taken working alone without AI, or even with in-house or outsourced developers.
The greatest challenge for humans is resisting the temptation to let AI steer and control our thinking, since our minds naturally relax when the pressure eases. The smarter and more confident AI becomes, the more critical it is for us to stay engaged and keep our hands on the wheel.
If one person can build a platform, this does not mean that developers’ jobs will go away. Like an architect drafting out a building with the stroke of a pen, there will be whole departments of engineers with their own pens building even more complexity on top of this new baseline.

