Why Semantic HTML Still Matters


21st July, 2025

Somewhere along the way, we forgot how to write HTML – or why it mattered in the first place.

Modern development workflows prioritise components, utility classes, and JavaScript-heavy rendering. HTML becomes a byproduct, not a foundation.

And that shift comes at a cost – in performance, accessibility, resilience, and how machines (and people) interpret your content.

I’ve written elsewhere about how JavaScript is killing the web. But one of the most fixable, overlooked parts of that story is semantic HTML.

This piece is about what we’ve lost – and why it still matters.

Semantic HTML is how machines understand meaning

HTML isn’t just how we place elements on a page. It’s a language – with a vocabulary that expresses meaning.

Tags like <article>, <nav> and <section> aren’t decorative. They express intent. They signal hierarchy. They tell machines what your content is, and how it relates to everything else.

Search engines, accessibility tools, AI agents, and task-based systems all rely on structural signals – sometimes explicitly, sometimes heuristically. Not every system requires perfect markup, but when they can take advantage of it, semantic HTML can give them clarity. And in a web full of structurally ambiguous pages, that clarity can be a competitive edge.

Semantic markup doesn’t guarantee better indexing or extraction – but it creates a foundation that systems can use, now and in the future. It’s a signal of quality, structure, and intent.

If everything is a <div> or a <span>, then nothing is meaningful.

It’s not just bad HTML – it’s meaningless markup

It’s easy to dismiss this as a purity issue. Who cares whether you use a <div> or a <section>, as long as it looks right?

But this isn’t about pedantry. Meaningless markup doesn’t just make your site harder to read – it makes it harder to render, harder to maintain, and harder to scale.

This kind of abstraction leads to markup that often looks like this:

<div class="tw-bg-white tw-p-4 tw-shadow tw-rounded-md"> <div class="tw-flex tw-flex-col tw-gap-2"> <div class="tw-text-sm tw-font-semibold tw-uppercase tw-text-gray-500">ACME Widget</div> <div class="tw-text-xl tw-font-bold tw-text-blue-900">Blue Widget</div> <div class="tw-text-md tw-text-gray-700">Our best-selling widget for 2025. Lightweight, fast, and dependable.</div> <div class="tw-mt-4 tw-flex tw-items-center tw-justify-between"> <div class="tw-text-lg tw-font-bold">$49.99</div> <button class="tw-bg-blue-600 tw-text-white tw-px-4 tw-py-2 tw-rounded hover:tw-bg-blue-700">Buy now</button> </div> </div> </div>

Sure, this works. It’s styled. It renders. But it’s semantically dead. 

It gives you no sense of what this content is. Is it a product listing? A blog post? A call to action? 

You can’t tell at a glance – and neither can a screen reader, a crawler, or an agent trying to extract your pricing data.

Here’s the same thing with meaningful structure:

<article class="product-card"> <header> <p class="product-brand">ACME Widget</p> <h2 class="product-name">Blue Widget</h2> </header> <p class="product-description">Our best-selling widget for 2025. Lightweight, fast, and dependable.</p> <footer class="product-footer"> <span class="product-price">$49.99</span> <button class="buy-button">Buy now</button> </footer> </article>

Now it tells a story. There’s structure. There’s intent. You can target it in your CSS. You can extract it in a scraper. You can navigate it in a screen reader. It means something.

Semantic HTML is the foundation of accessibility. Without structure and meaning, assistive technologies can’t parse your content. Screen readers don’t know what to announce. Keyboard users get stuck. Voice interfaces can’t find what you’ve buried in divs. Clean, meaningful HTML isn’t just good practice – it’s how people access the web.
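To make that concrete, here’s a minimal sketch of a page skeleton built from standard landmark elements – each one maps to a region a screen reader can announce and jump to, something a stack of <div>s can’t offer:

<header><!-- announced as "banner" --></header>
<nav aria-label="Primary"><!-- announced as "navigation" --></nav>
<main>
  <h1>Page title</h1>
  <!-- announced as "main" – where "skip to content" should land -->
</main>
<footer><!-- announced as "contentinfo" --></footer>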

That’s not to say frameworks are inherently bad, or inaccessible. Tailwind, atomic classes, and inline styles can absolutely be useful – especially in complex projects or large teams where consistency and speed matter. They can reduce cognitive overhead. They can improve velocity.

But they’re tools, not answers. And when every component devolves into a soup of near-duplicate utility classes – tweaked for every layout and breakpoint – you lose the plot. The structure disappears. The purpose is obscured.

This isn’t about abstraction. It’s about what you lose in the process.

And that loss doesn’t just hurt semantics – it hurts performance. In fact, it’s one of the biggest reasons the modern web feels slower, heavier, and more fragile than ever.

Semantic rot wrecks performance

We’ve normalised the idea that HTML is just a render target – that we can throw arbitrary markup at the browser and trust it to figure it out. And it does. Browsers are astonishingly good at fixing our messes.

But that forgiveness has a cost.

Rendering engines are designed to be fault-tolerant. They’ll infer roles, patch up bad structure, and try to render things as you intended. But every time they have to do that – every time they have to guess what your <div> soup is trying to be – it costs time. That’s CPU cycles. That’s GPU time. That’s power, especially on mobile.

Let’s break down where and how the bloat hits hardest – and why it matters.

Big DOMs are slow to render

Every single node in the DOM adds overhead. During rendering, the browser walks the DOM tree, builds the CSSOM, calculates styles, resolves layout, and paints pixels. More nodes mean more work at each stage.

It’s not just about download size (though that matters too – more markup means more bytes, and potentially less efficient compression). It’s about render performance. A bloated DOM means longer layout and paint phases, more memory usage, and higher energy usage.

Even simple interactions – like opening a modal or expanding a list – can trigger reflows that crawl through your bloated DOM. And suddenly your “simple” page lags, stutters, or janks.

You can see this in Chrome DevTools. Open the Performance tab, record a trace, and watch the flame chart light up every time your layout engine spins its wheels.

Fun fact: parsing isn’t the bottleneck – modern HTML parsers chew through markup far faster than the stages that follow. The real cost comes during CSSOM construction, style recalculation, layout, paint, and compositing. And HTML parsing only blocks when it hits a non-deferred <script> (or a script stalled behind a still-loading stylesheet) – which again underscores why clean markup matters, but also why you need a smart loading order.
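As a sketch of that loading order (file names here are placeholders): keep stylesheets in the <head> so they’re discovered early, and defer scripts so they never stall the parser.

<head>
  <link rel="stylesheet" href="/styles.css">  <!-- discovered early; blocks render, not parsing -->
  <script src="/app.js" defer></script>       <!-- downloads in parallel, runs after parsing -->
</head>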

Complex trees cause layout thrashing

But it’s not just about how much markup you have – it’s about how it’s structured. Deep nesting, wrapper bloat, and overly abstracted components create DOM trees that are hard to reason about and costly to render. The browser has to work harder to figure out what changes affect what – and that’s where things start to fall apart.

Toggle a single class, and you might invalidate layout across the entire viewport. That change cascades through parent-child chains, triggering layout shifts and visual instability. Components reposition themselves unexpectedly. Scroll anchoring fails, and users lose their position mid-interaction. The whole experience becomes unstable.

And because this all happens in real time – on every interaction – it hits your frame budget. Targeting 60fps? That gives you just ~16ms per frame. Blow that budget, and users feel the lag instantly.

You’ll see it in Chrome’s DevTools – in the “Layout Shift Regions” or in the “Frames” graph as missed frames stack up.

When you mutate the DOM, browsers don’t always re-layout the whole tree – there’s incremental layout processing. But deeply nested or ambiguous markup still triggers expensive ancestor checks. Research on layout invalidation – like the “Spineless Traversal” work – shows that browsers still pay a measurable penalty when many nodes need checking.

Redundant CSS increases recalculation cost

A bloated DOM is bad enough – but bloated stylesheets make things even worse.

Modern CSS workflows – especially in componentised systems – often lead to duplication. Each component declares its own styles – even when they repeat. There’s no cascade. No shared context. Specificity becomes a mess, and overrides are the default.

For example, here’s what that often looks like:

/* button.css */
.btn {
  background-color: #006;
  color: #fff;
  font-weight: bold;
}

/* header.css */
.header .btn { background-color: #005; }

/* card.css */
.card .btn { background-color: #004; }

Each file redefines the same thing. The browser has to parse, apply, and reconcile all of it. Multiply this by hundreds of components, and your CSSOM – the browser’s internal model of all CSS rules – balloons.

Every time something changes (like a class toggle), the browser has to re-evaluate which rules apply where. More rules, more recalculations. And on lower-end devices, that becomes a bottleneck.
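One way out – a sketch, not the only pattern – is to declare the component once and let each context override a custom property, so the browser reconciles one rule instead of three:

/* button.css – single source of truth */
.btn {
  background-color: var(--btn-bg, #006);
  color: #fff;
  font-weight: bold;
}

/* contexts override one variable, not the whole rule */
.header { --btn-bg: #005; }
.card   { --btn-bg: #004; }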

Yes, atomic CSS systems like Tailwind can reduce file size and increase reuse. But only when used intentionally. When every component gets wrapped in a dozen layers of utility classes, and each utility is slightly tweaked (margin here, font there), you end up with thousands of unique combinations – many of which are nearly identical.

The cost isn’t just size. It’s churn.

Browsers match selectors from right to left: for div.card p span, the engine finds every span, then checks whether an ancestor matches p, then whether a higher ancestor matches div.card. This is efficient for clear, specific selectors – but deep, bloated trees and generic descendant rules force a lot of over-scanning.
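In practice – assuming a typical component tree – the first selector below makes the engine treat every <span> on the page as a candidate and walk its ancestors, while the second resolves with a single class lookup:

/* expensive: every span is a candidate, ancestors must be walked */
div.card p span { color: #333; }

/* cheap: matches or fails in one step */
.product-price { color: #333; }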

Autogenerated classes break caching and targeting

It’s become common to see class names like .sc-a12bc, .jsx-392hf, or .tw-abc123. These are often the result of CSS-in-JS systems, scoped styles, or build-time hashing. The intent is clear: localise styles to avoid global conflicts. And that’s not a bad idea.

But this approach comes with a different kind of fragility.

If your classes are ephemeral – if they change with every build – then:

  • Your analytics tags break.
  • Your end-to-end tests need constant maintenance.
  • Your caching strategies fall apart.
  • Your markup diffs become unreadable.
  • And your CSS becomes non-reusable by default.

From a performance perspective, that last point is critical. Caching only works when things are predictable. The browser’s ability to cache and reuse parsed stylesheets depends on consistent selectors. If every component, every build, every deployment changes its class names, the browser has to reparse and reapply everything.

Worse, it forces tooling to rely on brittle workarounds. Want to target a button in your checkout funnel via your tag manager? Good luck if it’s wrapped in three layers of hashed components.
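A common mitigation – sketched here using the data-testid convention, though the attribute name is up to you – is to pair generated class names with stable, semantic hooks that analytics and tests can rely on:

<!-- hashed class for styling; stable hooks for tooling -->
<button class="sc-a12bc" data-testid="checkout-buy-button">Buy now</button>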

This isn’t hypothetical. It’s a common pain point in modern frontend stacks, and one that bloats everything – code, tooling, rendering paths.

Predictable, semantic class names don’t just make your life easier. They make the web faster.

Semantic tags can provide layout hints

Semantic HTML isn’t just about meaning or accessibility. It’s scaffolding. Structure. And that structure gives both you and the browser something to work with.

Tags like <main>, <nav>, <aside>, and <footer> aren’t just semantic – they’re block-level by default, and they naturally segment the page. That segmentation often lines up with how the browser processes and paints content. They don’t guarantee performance wins, but they create the conditions for them.

When your layout has clear boundaries, the browser can scope its work more effectively. It can isolate style recalculations, avoid unnecessary reflows, and better manage things like scroll containers and sticky elements.

More importantly: in the paint and composite phases, the browser can distribute rendering work across multiple threads. GPU compositing pipelines benefit from well-structured DOM regions – especially when they’re paired with properties like contain: paint or will-change: transform. By creating isolated layers, you reduce the overhead of re-rasterising large portions of the page.
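A sketch of that pairing, with illustrative values – a semantic region plus containment hints that give the compositor a clean boundary:

/* scope rendering work to the sidebar region */
aside.sidebar {
  contain: paint;          /* nothing inside paints outside this box */
  will-change: transform;  /* hint: promote to its own layer – use sparingly */
}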

If everything is a giant stack of nested <div>s, there’s no clear opportunity for this kind of isolation. Every interaction, animation, or resize event risks triggering a reflow or repaint that affects the entire tree. You’re not just making it harder for yourself – you’re bottlenecking the rendering engine.

Put simply: semantic tags help you work with the browser instead of fighting it. They’re not magic, but they make the magic possible.

Animations and the compositing catastrophe

Animations are where well-structured HTML either shines… or fails catastrophically.

Modern browsers aim to offload animation work to the GPU. That’s what enables silky-smooth transitions at 60fps or higher. But for that to happen, the browser needs to isolate the animated element onto its own compositing layer. Only certain CSS properties qualify for this kind of GPU-accelerated treatment – most notably transform and opacity.

If you animate something like top, left, width, or margin, you’re triggering the layout engine. That means recalculating layout for everything downstream of the change. That’s main-thread work, and it’s expensive.
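Side by side, the difference looks like this – both move a panel 200px, but only one stays off the layout path (the first assumes position: relative):

/* triggers layout on every frame – main-thread work */
.panel-bad.is-open  { left: 200px; transition: left 300ms; }

/* compositor-only – layout and paint stay untouched */
.panel-good.is-open { transform: translateX(200px); transition: transform 300ms; }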

On a simple page? Maybe you get away with it.

On a deeply nested component with dozens of siblings and dependencies? Every animation becomes a layout thrash. And once your animation frame budget blows past 16ms (the limit for 60fps), things get janky. Animations stutter. Interactions lag. Scroll becomes sluggish.

You can see this in DevTools’ Performance panel – layout recalculations, style invalidations, and paint operations lighting up the flame chart.

Semantic HTML helps here too. Proper structural boundaries allow for more effective use of modern CSS containment strategies:

  • contain: layout; tells the browser it doesn’t need to recalculate layout outside the element.
  • will-change: transform; hints that the element should get its own compositing layer.
  • isolation: isolate; and contain: paint; help prevent visual spillover and give the browser clean layer boundaries.

But these tools only work when your DOM is rational. If your animated component is nested inside an unpredictable pile of generic <div>s, the browser can’t isolate it cleanly. It doesn’t know what might be affected – so it plays it safe and recalculates everything.

That’s not a browser flaw. It’s a developer failure.

Animation isn’t just about what moves. It’s about what shouldn’t.

Compositing and rasterisation can run off the main thread in modern engines. But DOM and CSS changes that touch layout or paint force main-thread work, killing that advantage.

Promoting an element with will-change: transform – or animating only transform and opacity – tells the browser to composite it on its own layer. That avoids layout and paint work altogether – but only when the DOM structure allows distinct layering containers.

CSS containment and visibility: powerful, but fragile

Modern CSS gives us powerful tools to manage performance – but they’re only effective when your HTML gives them room to breathe.

Take contain. You can use contain: layout, paint, or even size to tell the browser “don’t look outside this box – nothing in here affects the rest of the page.” This can drastically reduce the cost of layout recalculations, especially in dynamic interfaces.
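For instance – a sketch for a self-contained widget whose internals update frequently:

.live-widget {
  contain: content;  /* layout + paint + style: internal churn can't dirty the rest of the page */
}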

But that only works when your markup has clear structural boundaries.

If your content is tangled in a nest of non-semantic wrappers, or if containers inherit unexpected styles or dependencies, then containment becomes unreliable. You can’t safely contain what you can’t isolate. The browser won’t take the risk.

Likewise, content-visibility: auto is one of the most underrated tools in the modern CSS arsenal. It lets the browser skip rendering elements that aren’t visible on-screen – effectively “virtualising” them. That’s huge for long pages, feeds, or infinite scroll components.

But it comes with caveats. It requires predictable layout, scroll anchoring, and structural coherence. If your DOM is messy, or your components leak styles and dependencies up and down the tree, it backfires – introducing layout jumps, rendering bugs, or broken focus states.
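Used carefully, it looks something like this – the intrinsic size reservation is what keeps scrollbars and layout stable while off-screen content is skipped (the 320px is an illustrative estimate):

.feed-item {
  content-visibility: auto;            /* skip rendering while off-screen */
  contain-intrinsic-size: auto 320px;  /* reserve an estimated height to prevent layout jumps */
}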

These aren’t magic bullets. They’re performance contracts. And messy markup breaks those contracts.

Semantic HTML – and a clean, well-structured DOM – is what makes these tools viable in the first place.

MDN’s docs highlight how contain: content (shorthand for layout, paint, and style containment) lets browsers optimise entire subtrees independently. And real-world A/B tests have shown INP improvements on e-commerce pages that adopt content-visibility: auto.

Agents are the new users – and they care about structure

The web isn’t just for humans anymore.

Search engines were the first wave – parsing content, extracting meaning, and ranking based on structure and semantics. But now we’re entering the era of AI agents, assistants, scrapers, task runners, and LLM-backed automation. These systems don’t browse your site. They don’t scroll. They don’t click. They parse.

They look at your markup and ask:

  • What is this?
  • How is it structured?
  • What’s important?
  • How does it relate to everything else?

A clean, semantic DOM answers those questions clearly. A soup of <div>s does not.

And when these agents have to choose between ten sites that all claim to sell the same widget, the one that’s easier to interpret, extract, and summarise will win.

That’s not hypothetical. Google’s shopping systems, summarisation agents like Perplexity, AI browsers like Arc, and assistive tools for accessibility are all examples of this shift in motion. Your site isn’t just a visual experience anymore – it’s an interface. An API. A dataset.

If your markup can’t support that? You’re out of the conversation.

And yes – smart systems can and do infer structure when they have to. But that’s extra work. That’s imprecise. That’s risk.

In a competitive landscape, well-structured markup isn’t just an optimisation – it’s a differentiator.
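If you want to make that interpretation trivial rather than merely possible, schema.org microdata layers explicit meaning onto markup you already have – a sketch extending the earlier product card:

<article class="product-card" itemscope itemtype="https://schema.org/Product">
  <h2 itemprop="name">Blue Widget</h2>
  <p itemprop="description">Our best-selling widget for 2025.</p>
  <span itemprop="offers" itemscope itemtype="https://schema.org/Offer">
    <meta itemprop="price" content="49.99">
    <meta itemprop="priceCurrency" content="USD">
    $49.99
  </span>
</article>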

Structure is resilience

Semantic HTML isn’t just about helping machines understand your content. It’s about building interfaces that hold together under pressure.

Clean markup is easier to debug. Easier to adapt. Easier to progressively enhance. If your JavaScript fails, or your stylesheets don’t load, or your layout breaks on an edge-case screen – semantic HTML means there’s still something usable there.
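A tiny example of that resilience: a native disclosure widget keeps working when the JavaScript never arrives, where a click-handler-on-a-<div> version simply goes dead.

<details>
  <summary>Shipping details</summary>
  <p>Orders ship within two business days.</p>
</details>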

That’s not just good practice. It’s how you build software for the real world.

Because real users have flaky connections. Real devices have limited power. Real sessions include edge cases you didn’t test for.

Semantic markup gives you a baseline. A fallback. A foundation.

Structure isn’t optional

If you want to build for performance, accessibility, discoverability, or resilience – if you want your site to be fast, understandable, and adaptable – start with HTML that means something.

Don’t treat markup as an afterthought. Don’t let your tooling bury the structure. Don’t build interfaces that only work when the stars align and the JavaScript loads.

Semantic HTML is a foundation. It’s fast. It’s robust. It’s self-descriptive. It’s future-facing.

It doesn’t stop you using Tailwind. It doesn’t stop you using React. But it does ask you to be deliberate. To design your structure with intent. To write code that tells a story – not just to humans, but to browsers, bots, and agents alike.

This isn’t nostalgia. This is infrastructure.

And if the web is going to survive the next wave of complexity, automation, and expectation – we need to remember how to build it properly.

That starts with remembering how to write HTML – and why we write it the way we do. Not as a byproduct of JavaScript, or an output of tooling, but as the foundation of everything that follows. 
