System Design Interviewing Tips (2022)


System design interviews are inherently subjective. Outcomes depend on many factors, including the backgrounds of both the interviewer and the candidate. Even if both have experience in backend systems, their expertise often lies in different domains, which makes direct alignment almost impossible.

Over the years, I have conducted countless interviews. I’ve also run plenty of mock design sessions for mentees. You throw out a question like “build Uber from scratch.” Five minutes in, your interviewee has four microservices, one monolith, and a single‑region database that would strand every driver the moment traffic spikes. Then come the follow‑ups.

I’ve been on the other side, too. More than once I’ve been asked to design a key/value store, scribbling back‑of‑the‑napkin math to figure out how to shard, replicate across the world, and keep data durable while it stays available. CAP theorem and all that. It’s all very real, and the hard part is you only get 45 minutes to land an answer.

What these rounds teach, whether you’re the coach or the candidate, is that good system design is simply a conversation about trade-offs. You’ve got to keep it human. Drifting into buzzword karaoke doesn’t help anyone. And one more thing: saying “I’ll just use AWS or GCP for that” doesn’t cut it, mate. Sure, you can leverage them, but what we’re after is the technology itself, not one cloud provider’s menu.

Here’s the cheat sheet I share with my students and mentees:

  • Start with the story. Who’s the user and what problem are we solving? If we can’t say it in one breath, no diagram will rescue us.
  • Surface the deal‑breakers. Speed, storage, privacy: name the hard constraints before they ambush us later.
  • Draw with intent. Every box costs money, and every arrow hides failure modes. If it doesn’t solve a real user need, ditch it.

But even with clear heuristics like these, system design interviews remain messy. No two conversations unfold the same way, because no two people carry the same background or biases.

This subjectivity makes system design interviews tricky, not just for candidates, but for interviewers too. The temptation to rely on gut feeling or vague impressions of seniority leads to noisy signals and inconsistent evaluations. If the goal is to find engineers who can reason deeply about trade-offs, structure complexity, and communicate clearly under pressure, then we need a better approach.

This post shares a few principles I’ve found helpful, especially for interviewers who want to improve the signal they get from these sessions and avoid bias traps. These aren’t universal rules, but field-tested practices that help me stay grounded when evaluating real-world thinking instead of whiteboard performance.


Score Card

Frameworks and reps teach us how to attack the whiteboard; the scorecard tells us how to judge it. Everyone has a picture of what good looks like, but on a panel we need the same picture. Otherwise inconsistency wins, and we end up chasing signals that don’t align; honestly, the outcome is sometimes barely better than a coin flip. When the loop ends, the panel must share one yardstick, or we’ll just argue adjectives: strong, pretty good, kinda shaky.

So, before anyone sketches another box, I pull out my seven columns: communication, business sense, autonomy, core design chops, scalability math, ops sanity, and rollout planning.
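To make this concrete, here is a minimal sketch of how those seven columns can be captured as a structured scorecard. The dimension names mirror the columns above, but the 1–4 scale and the evidence-note convention are my own assumptions, not a universal standard.

```python
from dataclasses import dataclass, field

# The seven scorecard dimensions, mirroring the columns above.
DIMENSIONS = [
    "communication",
    "business_acumen",
    "autonomy",
    "system_design",
    "efficiency_scalability",
    "operational_excellence",
    "planning",
]

@dataclass
class Scorecard:
    candidate: str
    interviewer: str
    # dimension -> (score on an assumed 1-4 scale, one-line evidence note)
    scores: dict[str, tuple[int, str]] = field(default_factory=dict)

    def rate(self, dimension: str, score: int, evidence: str) -> None:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        if not 1 <= score <= 4:
            raise ValueError("score must be between 1 and 4")
        self.scores[dimension] = (score, evidence)

# Usage: attaching evidence to every number keeps the debrief about facts.
card = Scorecard(candidate="Jane", interviewer="me")
card.rate("communication", 3, "Restated the user story unprompted; confirmed scope twice.")
```

The point isn’t the code; it’s that every score must come with a sentence of evidence, or the panel ends up arguing adjectives again.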

Communication

Why does it matter? In a 45-minute round every second you and the candidate talk past each other is a second you can’t get back. I’ve watched brilliant engineers sink themselves because they assumed I shared their mental picture. Five minutes later we realized our user wasn’t even the same persona. A candidate who pauses, rephrases the goal, and checks the room keeps both brains locked on the same problem and buys extra time for depth instead of rework. It also shows me how they’ll run a design review on the job. Will they let quieter voices in, or bulldoze the call?

  • Good sign – Restates the user story in one breath, asks “anything I’m missing before I draw?” and leaves space for pushback.
  • Red flag – Grabs the marker, draws boxes for five minutes, never pauses to confirm scope.

Business Acumen

Why does it matter? I’ve seen good architectures that never shipped because no one tied them to a number the business cared about. If pickup time creeps past two minutes, riders bail, and no diagram can save that. When a candidate links each component to a KPI, I know they’ll argue for the feature that moves the metric instead of chasing the perfect, infinitely available system.

  • Good sign – “P95 pickup has to stay under two minutes, so the dispatch service sits close to riders.”
  • Red flag – Polishes an exotic edge case before agreeing on what success even means.

Autonomy

Why does it matter? Nobody has time to spoon-feed senior engineers. In production nobody whispers “remember the cache” while you’re on call at 2 a.m. I need hires who treat hints as a bonus, not a lifeline. Watching how a candidate explores trade-offs without much guidance tells me whether they’ll unblock themselves or ping me for every fork in the road.

  • Good sign – Offers two data flows, weighs cost vs. latency without a prompt.
  • Red flag – Waits for clues to mention replication, caching, or failure paths.

System Design

Why does it matter? Service boundaries are organizational scar tissue. Cut in the wrong place and the on-call rota bleeds forever. I’ve paid the interest on those bad splits: teams waging war over ownership while outages bounce between them. A candidate who decomposes the system by capability and failure domain saves months of pager fatigue and politics.

  • Good sign – Breaks “build Uber” into auth, matching, pricing, realtime location; picks the right store for each workload (see the sketch below); designs for failure.
  • Red flag – Draws one giant backend box and hand-waves durability to the database.
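As a rough illustration of what “picks the right store for each workload” can look like, here is a hedged sketch. The service names echo the good sign above, but every store choice and failure domain here is an assumption for illustration, not the one true answer.

```python
# Illustrative capability -> datastore mapping for a ride-sharing design.
# Each entry is one defensible trade-off, not a universal recommendation.
SERVICES = {
    "auth": {
        "store": "relational DB",            # credentials need durability + consistency
        "failure_domain": "per-region",
        "why": "low QPS, but losing or corrupting accounts is unacceptable",
    },
    "matching": {
        "store": "in-memory state + write-ahead log",
        "failure_domain": "per-city",
        "why": "hot path; one city's outage shouldn't strand other cities",
    },
    "pricing": {
        "store": "relational DB + cache",
        "failure_domain": "per-region",
        "why": "read-heavy; a few seconds of staleness is acceptable",
    },
    "realtime_location": {
        "store": "key/value store with TTLs",
        "failure_domain": "per-city",
        "why": "huge write volume, short-lived data; dropping a point is fine",
    },
}

for name, spec in SERVICES.items():
    print(f"{name:18} -> {spec['store']:32} [{spec['failure_domain']}]")
```

A candidate doesn’t need this exact table; what matters is that every box comes with a “why” and a blast radius.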

Efficiency & Scalability

Why does it matter? The average CFO doesn’t care if your architecture diagram looks like a work of art. They care that the bill didn’t spike 4x overnight. I want to see that a candidate can ballpark traffic, estimate load, and flag the crash before we hit the wall at 100 km/h.

  • Good sign – “100k req/s × 1 KB ≈ 100 MB/s of throughput, so I should design for that.” (The math is worked through in the sketch below.)
  • Red flag – “Storage is infinite in the cloud, right?”
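To show the kind of napkin math I’m listening for, here is a minimal worked version. The traffic numbers, payload size, retention window, and replication factor are all assumptions for illustration.

```python
# Back-of-the-napkin capacity estimate with assumed inputs.
PEAK_RPS = 100_000          # assumed peak requests per second
PAYLOAD_BYTES = 1_000       # assumed ~1 KB per request
RETENTION_DAYS = 30         # assumed retention window
REPLICATION_FACTOR = 3      # assumed 3x replication for durability

ingress_mb_s = PEAK_RPS * PAYLOAD_BYTES / 1e6
daily_gb = ingress_mb_s * 86_400 / 1e3
stored_tb = daily_gb * RETENTION_DAYS * REPLICATION_FACTOR / 1e3

print(f"ingress: ~{ingress_mb_s:,.0f} MB/s")    # ~100 MB/s
print(f"writes:  ~{daily_gb:,.0f} GB/day")      # ~8,640 GB/day
print(f"storage: ~{stored_tb:,.0f} TB over {RETENTION_DAYS} days, replicated")
```

Whether the candidate does this in Python or on a napkin is irrelevant; what matters is that the sizing happens before the boxes get drawn.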

Operational Excellence

Why does it matter? Fancy arrows don’t wake you at 3 a.m.; alerts do. The best design in the world is useless if the first sign of trouble is a VP on Slack. When a candidate brings up dashboards, rollout safety nets, and failure drills, I know they’ve been paged before and plan to let me sleep.

  • Good sign – Calls out blue-green deploys, three must-have dashboards, and the alert path for 500 spikes (sketched below).
  • Red flag – Focuses on infrastructure-level signals, ignoring application health and user experience.
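For a flavor of what an alert path for 500 spikes might look like, here is a hedged sketch of the underlying logic. In practice this lives in a monitoring system, not application code, and the 5% threshold, 60-second window, and minimum sample count are illustrative assumptions.

```python
import time
from collections import deque

WINDOW_SECONDS = 60           # assumed sliding window
ERROR_RATIO_THRESHOLD = 0.05  # assumed 5% error-rate threshold
MIN_SAMPLES = 100             # guard: don't page on one failure at low traffic

events: deque[tuple[float, bool]] = deque()  # (timestamp, is_5xx)

def record(status_code: int, now: float | None = None) -> bool:
    """Record one response; return True if the 5xx alert should fire."""
    now = time.time() if now is None else now
    events.append((now, status_code >= 500))
    # Evict samples that fell out of the window.
    while events and events[0][0] < now - WINDOW_SECONDS:
        events.popleft()
    errors = sum(1 for _, bad in events if bad)
    return len(events) >= MIN_SAMPLES and errors / len(events) > ERROR_RATIO_THRESHOLD
```

The minimum-sample guard is the detail I listen for: it’s the difference between an alert tied to user experience and one that pages someone because two requests out of three failed at 4 a.m.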

Planning

Why does it matter? I’ve seen bulletproof designs crumble because rollout was an afterthought. A well-planned, phased launch with kill switches and feedback loops turns a leap of faith into a safe staircase. Candidates who can map out milestones, canary releases, and iteration checkpoints show they know how to ship value step by step, not gamble it all on a big-bang deployment.

  • Good sign – Lays out a phased launch: pilot region, canary flags, feedback signals, rough team-weeks per phase (see the rollout sketch below).
  • Red flag – No mention of risk mitigation, rollback paths, or feedback.
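As an example of the canary-plus-kill-switch mechanics, here is a minimal sketch of a percentage-based rollout. The bucketing scheme, phase percentages, and kill switch are assumptions for illustration.

```python
import hashlib

# Deterministic percentage rollout: a user is in the canary iff their
# stable hash bucket falls below the current rollout percentage.
ROLLOUT_PHASES = [1, 5, 25, 100]  # assumed percent-of-users per phase
KILL_SWITCH = False               # flip to route everyone back to the old path

def in_canary(user_id: str, rollout_percent: int) -> bool:
    if KILL_SWITCH:
        return False
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100  # stable 0-99 bucket
    return bucket < rollout_percent

# Usage: advance to the next phase only while feedback signals
# (error rates, the KPIs above) stay healthy.
print(in_canary("rider-42", ROLLOUT_PHASES[1]))
```

Hashing on the user ID matters: the same users stay in the canary across requests, so you get a coherent cohort to measure instead of random noise.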

Leveling 

I stick to three tiers: Mid-Senior, Senior, and Staff+. Anything past Staff drifts into company-specific folklore, but these three tiers cover 95% of the loops I run. The higher the level, the less I care about tidy boxes and the more I look for force multiplication: can this person unblock others, future-proof the system, and keep the pager quiet? Here’s the expanded lens I use.

Mid-Senior

He comes up with a solution. He has hands-on experience with part of the problem and explains the rest of the system from theory. He focuses on implementation details with less attention to the overall system design, and puts little thought into how a problem would be broken down across a team.

  • Gets something working, fast. Leans on hands-on experience for the slice they know, hand-waves the rest from theory.
  • Loves implementation details: thread pools, cache eviction. Spends little airtime on how services fit together or how work is divided across a team.
  • Failure modes, capacity math, and rollout phases surface after a nudge. When prompted, answers are sound but rarely volunteered.
  • Mentions an MVP and a phase two yet hasn’t wrestled a multi-stage rollout in anger.
  • Can own a service end to end when the domain is familiar; needs a seasoned partner to cover blind spots in new territory.

Overall, you are looking at a dependable builder who still needs guard-rails when the problem spills beyond the box they’re drawing.

Senior

He comes up with a detailed solution. He has experience in building similar systems. He has a good theoretical knowledge of the parts he doesn’t have experience building. He has a strong sense of the design of the overall system and how this could be delivered by a team. The interviewer observes evidence of planning, scaling, optimization, and operations.

  • Starts big-picture first, details second. Restates the business goal unprompted, slices the system by user journey, then dives into data structures and APIs.
  • Trades off fluently. Anchors every decision to the KPI that pays the bills; kills shiny features that don’t move the needle.
  • Takes failure cases into account from the beginning. Treats regional outages, network flakes, and thundering-herd spikes as first-class citizens, not post-it notes.
  • Runs back-of-the-napkin math out loud (requests per second, disk IOPS, p95 latency) and adjusts sharding or caching when the numbers scream.
  • Talks SLIs/SLOs, dashboards, blue-green deploys, and an on-call rota without waiting for hint cards.
  • Sketches a three-stage launch (canary, city pilot, full go) with kill switches and feedback loops.
  • Evaluates impact radius. Frees the manager to look ahead; junior engineers copy their playbook.

Overall, you are looking at an engineer who you trust to ship the system and own the pager without supervision.

Staff+

She comes up with a comprehensive solution that considers not only the current business case but also future and other business objectives. She has overseen and built similar solutions, with excellent practical experience backed by excellent theory. She proposes novel solutions, discusses multiple trade-offs, and anticipates the evolution of the systems. She has an appetite for scaling, reuse, optimization, and operational excellence.

  • Designs for the current ask and the pivot looming six months out, weaving in reuse, extensibility, and graceful evolution.
  • Spots cross-team coupling early, proposes platform primitives or contracts that let half the org move faster.
  • Sketches two or three credible architectures, lays out crystal-clear trade-offs, and drives consensus without exec air cover.
  • Builds load models for next year’s traffic, highlights the ceilings we’ll hit first, and budget-checks every new dependency.
  • Pushes blameless postmortems, chaos drills, and paved-road tooling so others inherit resilience by default.
  • Orchestrates multi-team, multi-quarter migrations; lines up owners, sequencing, and staffing before a single PR lands.
  • Unblocks seniors, up-levels whole teams, and bends the roadmap to find better ways of building.

Overall, you are looking at an engineer who sees around corners, scales both code and people, and makes everyone else’s job easier.

Training 

One of the most effective ways to improve as an interviewer is to treat it like any other skill. It gets better through deliberate practice and structured feedback. Most engineers aren’t trained to evaluate others; we tend to mimic how we were interviewed ourselves. That’s a risky foundation. Without a shared mental model of what good looks like, teams drift into bias, inconsistency, and gut-feel verdicts that don’t hold up under scrutiny.

That’s why I recommend running calibration sessions with your team before using a system design question in real interviews. Pick a few questions you expect to use regularly, say “design a ride-sharing app” or “build a distributed key-value store,” and run mock interviews with colleagues across different levels: Mid, Senior, Staff. This gives you a reference point for how engineers at each stage of growth tend to reason, what they prioritize, and where they shine or get stuck.

To structure these sessions, use a simple progression:

  1. Start with paired practice. Take turns interviewing each other using a real system design question. Focus on realism; do not role-play an idealized candidate. If you’d genuinely ask for clarification or get stuck on a trade-off, lean into it.
  2. Shadow and reverse-shadow.
    • In a shadowing session, one person interviews a candidate while others silently observe. Pay attention not just to the candidate’s answers, but to how the interviewer runs the session and how they probe, steer, and follow up.
    • In reverse shadowing, the observer becomes the interviewer while the original interviewer watches. This flips the lens and makes it easier to give actionable feedback.
  3. Debrief after each session. Take 10–15 minutes to compare scorecards, surface disagreements, and ask:
    • Did we interpret the candidate’s approach the same way?
    • Did we agree on the signal strength for each area?
    • If there were differences, were they about interpretation or calibration?

Use those conversations to gradually refine your internal rubric. Keep a shared doc with examples of what Mid/Senior/Staff-level performance looks like for your most common questions. Update it as you see more patterns emerge. The more aligned your panel is, the fairer your interviews become and the more confident you’ll feel during hiring decisions.

When I’ve run this kind of structured training loop, I’ve seen teams go from noisy, inconsistent interviews to clean, high-signal sessions that leave no one arguing over adjectives like “solid” vs. “decent.” You don’t need weeks of prep. You just need a few deliberate cycles, some honest feedback, and a willingness to challenge your own assumptions.

Over time, this practice turns interviews from a guessing game into a craft. It truly helps you hire not just strong individuals, but the right ones for your context.

Evaluation

The interview doesn’t end when the candidate leaves. It ends when the panel aligns on what they saw and why it matters. That’s where evaluation comes in. The scorecard isn’t just a form to fill out; it’s a tool to focus discussion and reduce noise.

I have seen many interviewers who only try to fill in the scorecard. Filling out a scorecard is not a conversation. The interview should be a challenging yet enjoyable conversation for both parties. We should take notes, but we shouldn’t be mechanical.

After the interview, I always circle back for a quick calibration: compare my scores with the rest of the panel, surface any big gaps, and reconcile them while the details are fresh. That two-minute huddle catches blind spots, keeps the rubric aligned, and ensures one interviewer’s pet peeve doesn’t derail an otherwise solid hire.
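If it helps to make “surface any big gaps” concrete, here is a tiny sketch of the comparison. Most panels do this informally; the scores and the 2-point gap threshold on a 1–4 scale are assumptions for illustration.

```python
# Compare panel scorecards per dimension and flag large disagreements.
panel_scores = {
    "alice": {"communication": 3, "system_design": 4, "planning": 2},
    "bob":   {"communication": 3, "system_design": 2, "planning": 3},
}

GAP_THRESHOLD = 2  # assumed: a 2-point spread on a 1-4 scale needs discussion

dimensions = sorted(set().union(*(s.keys() for s in panel_scores.values())))
for dim in dimensions:
    values = [scores[dim] for scores in panel_scores.values() if dim in scores]
    if max(values) - min(values) >= GAP_THRESHOLD:
        print(f"calibrate on '{dim}': scores ranged {min(values)}-{max(values)}")
```

A two-point gap usually means the panel saw different things, not that someone is wrong; that’s exactly the conversation to have while the details are fresh.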

Final remarks

Interview performance depends on many different factors. One notable factor is the problem domain. If we expect someone to work on databases, we should probably ask a system design question that concentrates on databases; I don’t see the point of asking a machine learning question of someone who will work on databases. Thus, each organization should probably have its own set of interview questions.

All in all, I believe interviewing someone and interviewing for a position are both stressful. Interview performance doesn’t necessarily show whether someone will be good at the role; I have seen it over and over again. Some people are just good at exams and interviews. Some people stress a lot and need time to think. Hence, the interview process doesn’t define anyone’s success. It’s just a tool for hiring people.
