A while ago, a prominent Vercel employee (two, actually) posted to the tune of:
Developers don’t get CDNs
Exhibit A etc.
Random tweets somehow get me into a frenzy far too often – somebody is wrong on the internet, yet again. But when I gave this one a second thought, I figured that… this statement has more merit than I would have wanted it to have.
It has merit because we do not know the very basics of cache control that are necessary (and there are not that many)!
It does not have merit in the sense that force-prefetching all of your includes through Vercel’s magic RSC-combine will not, actually, solve all your problems. They are talking in solutions that they sell, and what they are not emphasizing is that the issue is with the “developer slaps ‘Cache-Control’” part. Moreover: as I will explain, a lot of juice can be squeezed out of you by CDN providers exactly because your cache control is not in order and they offer you tools that kind of “force” your performance back into a survivable state. With some improvement for your users, and to the detriment of your wallet. But first, let’s rewind and see what those CDNs actually do.
CDNs use something called “conditional GET requests”. Conditional GET requests mean: Cache-Control. And even I, in my hubris, haven’t been using it correctly. After reviewing how it worked on a few of my own sites, I have overhauled my uses – and built up a “minimum understanding” of it which has been, to say the least, useful.
So, there it is: the absolute bare minimum of Cache-Control knowledge you may need for a public, mostly-static (CMS-driven, let’s say) website. Strap in, this is going to be wild.
And be mindful of one thing: I do not work for Vercel, CloudFlare, AWS or Fastly. I just like fast websites and I think you deserve to have your website go fast as well.
What are web caches, anyway?
Put simply: web caches ensure that once you have downloaded something, you get to keep it as long as it doesn’t change. Or, at the very least, you will be downloading it from someplace that is faster - and closer to you - than the website you are originally trying to access.
Your browser has a cache (this is your private cache, just yours). Then there can be “intermediate” caches, which we are going to call “caching proxies”. At the very end of that chain there is the actual website you are downloading from - the origin.
If you are running the origin website - it is your responsibility to use those caches to the max - and, if possible, running one of your own. As a matter of fact, we are going to do just that.
In the times of yore, when dinosaurs roamed the earth and shared web hosting was commonplace - caching proxies were actually much more pervasive than they are now. To begin with, many hosting providers would forcibly install something called squid in front of all websites hosted by them. No “ifs”, no “buts” - if you were hosting with a certain provider, they would stick a caching proxy in the middle regardless of whether you wanted it or not, whether it was beneficial for your use case - or not.
Next, company intranets were a thing and internet access (especially: fast internet access) was expensive. Doubly so - for business users, and in offices. So - in addition to the caching proxies installed by hosting providers - you would also have the “internal” caching proxies. Remember the Internet Security and Acceleration Server from Microsoft? Well, the “Acceleration” part was about doing caching proxying for your entire office.
And then, with pervasive broadband - and both the more widespread deployment of SSL and the emergence of virtual machine based hosting - those caching proxies kind of faded out of the picture, except for the use cases of the biggest, most frequented web resources on the planet. The Akamai CDN is old, and Apple has been using Akamai since… 1999? What is Akamai, you may ask? Well… it is a caching proxy!
A CDN versus a caching proxy
For ease of understanding I defined it like this: a CDN is a sophisticated caching proxy with multiple caching nodes. An ideal CDN would have multiple tiers of caches, and have those caches geographically distributed to provide multiple POEs (“points of entry”) for users in different locations, with the “hottest” (most frequently accessed) pages being available in the caches closest to the user.
An ultimate “distribution strategy” would be something similar to where Netflix and YouTube have gone. They install “caching appliances” (caching web proxy servers, essentially) in hardware form, right at the ISP’s data center - so that when you go to watch the latest episode of Adolescence you get it quickly and without your ISP having to go fetch it from Netflix - as your neighbor already watched it yesterday, and it is sitting in the ISP’s caching appliance Netflix has given them.
So, for a web developer like myself and you: a CDN is a caching proxy, just a fairly sophisticated one. The more sophisticated - the more “tricks” you can employ with it. For example, edge functions are basically scripts/small web apps which run on the CDN edge nodes, and can route, restrict or splice requests before they get to your web application.
But for 80% of use cases - or more - you can get by without any of that. No fancy Vercel prefetch, no fancy DOs, no edge functions - just the good, old Cache-Control. But it is a fickle mistress and you have to hold it right.
For static files, cache control is usually good already
How is it that we often don’t need to think about Cache-Control at all? Well, all of the modern webservers - without exception - are quite adept at setting cache-friendly headers for static files. It is not difficult to do, and since they do it so well, for most users - you included - this is fully transparent.
But once the output comes from your web application - a good webserver will step aside and not touch your Cache-Control at all. If it does - either the server is misconfigured by mistake, or it is deliberately misconfigured to force caching on you - by a nefarious systems administrator at your hosting provider (or a semi-hostile devops team) to save costs. This is an interesting topic which we will revisit.
Learning with your own toys
I firmly believe that you can place little trust in things you can’t run locally. At the very minimum a faithful reproduction is also good enough, provided it works well. For Ruby web apps, there happens to be an old-but-good caching proxy solution which actually embeds inside your Rails (or Rack) application. It will intercept incoming requests, examine the responses you return to the client - and cache them as headers dictate. So, let’s put together a bare-bones Rack application which serves us a cheeky little piece of content:
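Something along these lines will do - a bare-bones config.ru (a sketch; any small plain-text body works, I am using a 16-byte one so the numbers further down line up):

```ruby
# config.ru
run(lambda do |env|
  body = "Hello from Rack\n" # 16 bytes
  [200, {"Content-Type" => "text/plain", "Content-Length" => body.bytesize.to_s}, [body]]
end)
```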
We will also need another piece of kit which I have been using for ages, and it is a curl shell alias. A shell alias is just simpler to set up, you can use an HTTP client call if you prefer that. The alias goes like this:
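Mine looks roughly like this - yours may differ, any incantation that performs a GET and prints only the response headers will do:

```sh
# Print the response headers of a GET request, discard the body
alias headercheck='curl -sS -D - -o /dev/null'
```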
and you run it like this:
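```sh
headercheck https://your-project.github.io/
```

(Any GitHub Pages site works for following along.)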
It does something very simple: makes a GET request (not a HEAD, and that is important!) using curl, and then prints you just the headers. It discards the response body.
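Against a GitHub Pages site the output looks something like this (values illustrative - the shape is what matters):

```
HTTP/2 200
content-type: text/html; charset=utf-8
last-modified: Sat, 01 Mar 2025 10:00:00 GMT
etag: "65a1f0c3-2c8"
cache-control: max-age=600
vary: Accept-Encoding
via: 1.1 varnish
age: 42
```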
From this example, we can actually already see a few interesting things:
- The presence of etag indicates that GitHub calculates some kind of checksum for the index page
- The last-modified defines the last modification date of the file
- The via: 1.1 varnish indicates that GH Pages is using a caching proxy of its own - one called Varnish
- The cache-control says max-age=600 - we will get to that
- The vary: Accept-Encoding says that any caching that applies for us is content-encoding specific (so a separate cache will be used for Gzip-compressed responses and for plain responses)
- The etag is a “strong” ETag - it is just a string in quotes, without the W/ prefix in front. This means that the representation we have fetched is specific to the content-encoding (or, rather - to the combination of our request headers mentioned in the vary: header)
We are going to use this headercheck command to experiment with our local nano-CDN (which conveniently lives inside our Rack application).
Let’s complete our Rack web app with dependency definitions and “headercheck” it:
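A sketch of how that could look - a tiny Gemfile plus the same config.ru (I am pinning Rack 2.x here, since the header spelling below assumes it):

```ruby
# Gemfile
source "https://rubygems.org"
gem "rack", "~> 2.2"
gem "puma"

# config.ru - run with: bundle exec rackup -p 3000
run(lambda do |env|
  body = "Hello from Rack\n"
  [200, {"Content-Type" => "text/plain", "Content-Length" => body.bytesize.to_s}, [body]]
end)
```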
When we headercheck it, we get:
```
julik@kaasbook cora (use-stepper-motor-for-checking) $ headercheck http://localhost:3000
HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 16
```

Since we are interested in the headers here - and headers get cached as well - let’s add our “sentinel value” (the request ID) to the headers. If it does not change between our invocations of headercheck, we will know that the cached output is being served:
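A sketch - any value that is unique per request will do, SecureRandom is the easiest:

```ruby
require "securerandom"

run(lambda do |env|
  body = "Hello from Rack\n"
  headers = {
    "Content-Type" => "text/plain",
    "Content-Length" => body.bytesize.to_s,
    "X-Request-Id" => SecureRandom.uuid, # our sentinel - changes on every call that reaches the app
  }
  [200, headers, [body]]
end)
```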
…and hit it with a couple of requests:
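The exact values will differ, but the shape is:

```
$ headercheck http://localhost:3000
HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 16
X-Request-Id: 6f0e7f9a-0d1c-4b62-9b0a-0c5a1a2d3e4f

$ headercheck http://localhost:3000
HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 16
X-Request-Id: 2b9d5c11-7e3a-4f08-8c21-9a4b5d6e7f80
```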
As designed, the x-request-id is different on every request - we have a completely dynamic web application. By default, it will not cache anything.
Now, let’s add rack-cache
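Add gem "rack-cache" to the Gemfile and wire the middleware up in front of the app. I am deliberately giving it file-backed stores here (the paths are arbitrary) - it will matter in a minute:

```ruby
# config.ru, above the existing run(...)
require "rack/cache"

use Rack::Cache,
  verbose: true,                             # print cache decisions to the server log
  metastore:   "file:/tmp/rack-cache/meta",  # response metadata on disk
  entitystore: "file:/tmp/rack-cache/body"   # response bodies on disk
```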
Initially, nothing changes:
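Illustrative output - the x-request-id keeps changing on every request, the only new thing is the trace header rack-cache adds:

```
HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 16
X-Request-Id: 9c1d2e3f-5a6b-4c7d-8e9f-0a1b2c3d4e5f
X-Rack-Cache: miss
```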
We do see, however, that rack-cache has recorded a cache miss - it tried to satisfy the request using cached data, but it could not - so it has let the request “fall through” to our app.
Now let’s do some cache controlling:
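A sketch - the app simply starts sending a Cache-Control header of its own:

```ruby
headers = {
  "Content-Type" => "text/plain",
  "Content-Length" => body.bytesize.to_s,
  "X-Request-Id" => SecureRandom.uuid,
  "Cache-Control" => "public, max-age=600", # allow caching for 10 minutes
}
```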
and check the headers:
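Two requests in a row now look roughly like this (values illustrative):

```
$ headercheck http://localhost:3000
HTTP/1.1 200 OK
Cache-Control: public, max-age=600
Content-Type: text/plain
Content-Length: 16
X-Request-Id: 4d5e6f70-8192-4a3b-9c0d-1e2f3a4b5c6d
X-Content-Digest: 3f786850e387550fdab836ed7e6dc881de23001b
X-Rack-Cache: miss, store
Age: 0

$ headercheck http://localhost:3000
HTTP/1.1 200 OK
Cache-Control: public, max-age=600
Content-Type: text/plain
Content-Length: 16
X-Request-Id: 4d5e6f70-8192-4a3b-9c0d-1e2f3a4b5c6d
X-Content-Digest: 3f786850e387550fdab836ed7e6dc881de23001b
X-Rack-Cache: fresh
Age: 7
```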
Now, rack-cache starts caching our app’s output. We limit the age of the cache, and we can see that rack-cache has also computed the x-content-digest of our response, and that it stays the same. We can also see that the value of x-request-id header does not change. Our little “CDN in a box” is doing its job, and you now have what is an equivalent of the Rails page cache.
It is also broken. For example, imagine we want to change the output of our little app:
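Say we edit the body string in config.ru:

```ruby
body = "Bonjour from Rack!\n" # was "Hello from Rack\n"
```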
We make a request, and…
Oops. We are in exactly the same dreadful spot the webmasters of the early 2000s would find themselves with a provider forcibly imposing a caching proxy on them. Your page is now firmly planted in the rack-cache, and won’t be excised and re-requested from your app until age turns 600 seconds or more. Bah.
What can we do about it?
Cache validity
When you talk about caching, a fairly common term is “cache invalidation”. It’s a weirdly chosen name, because there is nothing “invalid” about it. Rather, “validation” with caches means literally this: asking whether this cache is still fresh. The validation can be done using some sort of validation handle. For example, it can be a timestamp. We can let our app tell the caching proxy that the last modification time on our resource is X, and it then works as a validator. It will work in two ways:
- It will inform the caching proxy better as to what max-age refers to. A proxy may choose to revalidate with the origin (our app) even if max-age has not yet lapsed, or it may not. We’ll get there
- It allows the proxy to ask our app: “Have you modified this resource since X?” If our app responds with “Nope, still the same” (which is the 304 status) - the caching proxy can choose to extend its cached version storage for another max-age, or to just serve the cached version now
Since we are inside of a script and we know when we modify it - it is a file, dammit - we can grab the mtime of our script as our Last-Modified. It does need to be formatted using Time#httpdate:
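A sketch - inside config.ru, __FILE__ points at the file itself (same requires and Rack::Cache setup as before, plus require "time" for httpdate):

```ruby
run(lambda do |env|
  body = "Hello from Rack\n"
  last_modified = File.mtime(__FILE__) # the mtime of config.ru itself
  headers = {
    "Content-Type" => "text/plain",
    "Content-Length" => body.bytesize.to_s,
    "X-Request-Id" => SecureRandom.uuid,
    "Cache-Control" => "public, max-age=600",
    "Last-Modified" => last_modified.httpdate,
  }
  [200, headers, [body]]
end)
```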
Now, when we do a few GETs in sequence, the behavior of the caching proxy changes:
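Abridged and illustrative - the first request stores, the second is served from the cache and now carries our validator:

```
X-Rack-Cache: miss, store
Last-Modified: Sat, 01 Mar 2025 10:00:00 GMT
Age: 0

X-Rack-Cache: fresh
Last-Modified: Sat, 01 Mar 2025 10:00:00 GMT
Age: 12
```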
as does the output of our Rack app (the output is essentially the same as the x-rack-cache header):
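With verbose: true it writes its trace to the server log, roughly:

```
cache: [GET /] miss, store
cache: [GET /] fresh
```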
So now we know rack-cache has used our validator and cached the initial version, and then served it from its cache. Neat. Let’s adjust our max-age a bit:
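The exact value does not matter much - let’s lower it to 30 seconds (which also happens to match the waiting we will be doing further down):

```ruby
"Cache-Control" => "public, max-age=30",
```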
restart our app, and do a headercheck again - we changed the file, so it should pick up our changes and…
Nope. It happily continues serving our previous version until the age value exceeds the 600 seconds we have set. Pudu. What we need to do in this case is actually tell rack-cache that it should revalidate with the origin every time:
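A sketch:

```ruby
"Cache-Control" => "public, max-age=30, must-revalidate",
```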
And - if we don’t want to wait another 5 minutes - we need to manually delete the cache:
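Since our little cache keeps everything under /tmp/rack-cache (as configured above), deleting it is just:

```sh
rm -rf /tmp/rack-cache
```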
This is an important lesson (and one of the reasons “developers don’t get CDNs”).
Lesson 1: Once you have told your caching proxies to cache something, it may be a nuisance to make them “forget” that cached data.
The initial request and the subsequent ones work fine:
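Abridged to the interesting header:

```
X-Rack-Cache: miss, store   # first request after clearing the cache
X-Rack-Cache: fresh         # subsequent requests, while within max-age
```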
and when we change the contents of our app and restart it, we see an actual invalidation followed by fresh:
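Roughly:

```
X-Rack-Cache: stale, invalid, store   # the stored copy was stale, validation against the app failed, a new copy is stored
X-Rack-Cache: fresh                   # ...and that new copy is served from cache again
```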
Once max-age lapses, our app again gets hit and rack-cache knows that even though last-modified is the same, the output has likely changed and checks anyway.
Actually making the GET conditional
Now let’s actually use our validator. In our previous example, rack-cache would ask us for a new rendered response regardless once the max-age has lapsed. Now, let’s check for the If-Modified-Since header - and if it hasn’t changed since our file modification time - respond with a 304:
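A sketch (same Rack::Cache setup as before; note the require "time" for parsing and formatting the HTTP dates):

```ruby
require "securerandom"
require "time"

run(lambda do |env|
  last_modified = File.mtime(__FILE__)

  # The conditional part: if the caller (rack-cache, in our case) sends If-Modified-Since
  # and our file has not changed since then, answer 304 and skip the body entirely
  ims = env["HTTP_IF_MODIFIED_SINCE"]
  if ims && last_modified.to_i <= Time.httpdate(ims).to_i
    return [304, {"Last-Modified" => last_modified.httpdate}, []]
  end

  body = "Hello from Rack\n"
  headers = {
    "Content-Type" => "text/plain",
    "Content-Length" => body.bytesize.to_s,
    "X-Request-Id" => SecureRandom.uuid,
    "Cache-Control" => "public, max-age=30, must-revalidate",
    "Last-Modified" => last_modified.httpdate,
  }
  [200, headers, [body]]
end)
```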
Now, a few requests (and having waited 30 seconds), we see that our validator is used and is actually still valid:
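Abridged and illustrative:

```
$ headercheck http://localhost:3000
HTTP/1.1 200 OK
X-Request-Id: 7a8b9c0d-1e2f-4a3b-8c4d-5e6f7a8b9c0d
X-Content-Digest: 3f786850e387550fdab836ed7e6dc881de23001b
X-Rack-Cache: fresh
Age: 21

$ headercheck http://localhost:3000   # once the 30 seconds have lapsed
HTTP/1.1 200 OK
X-Request-Id: 7a8b9c0d-1e2f-4a3b-8c4d-5e6f7a8b9c0d
X-Content-Digest: 3f786850e387550fdab836ed7e6dc881de23001b
X-Rack-Cache: stale, valid, store
Age: 0
```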
Note how, in the second request, while its age is reset to 0, the x-request-id header is exactly the same - and so is the content-digest! rack-cache resets the age, but it reuses the already cached response and does not call our application again.
A last-modified has a fundamental limitation: it is a low-resolution timestamp (one-second resolution). If our resource changes multiple times within a second, the later changes will not register - even though the contents of the resource have changed. But more importantly - a last-modified does not give us a fundamental “webmastery” property we would want at all times: read-after-write consistency.
When you have a page, and some kind of CMS that renders that page, you absolutely want to see your changes immediately as you hit “Publish”. But if you publish several times within the same second, your last-modified stays the same. It is a wall clock, not a vector clock.
This is why ETags are better. ETags are “freeform checksums” - values that only your application knows the semantics of. For example, you can have an ETag which says “version-1” and then gets incremented to “version-2”, “version-3” and so forth, effectively embedding a vector clock in the etag.
For reasons™ ETags must be in quotes. Let’s replace our modification time of the file with its checksum:
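A sketch - digest the file itself, and answer If-None-Match instead of If-Modified-Since:

```ruby
require "digest"
require "securerandom"

run(lambda do |env|
  etag = %("#{Digest::SHA1.file(__FILE__).hexdigest}") # a strong ETag, in quotes

  # Conditional GET: if the caller already holds this exact version, tell it so
  if env["HTTP_IF_NONE_MATCH"] == etag
    return [304, {"ETag" => etag}, []]
  end

  body = "Hello from Rack\n"
  headers = {
    "Content-Type" => "text/plain",
    "Content-Length" => body.bytesize.to_s,
    "X-Request-Id" => SecureRandom.uuid,
    "Cache-Control" => "public, max-age=30, must-revalidate",
    "ETag" => etag,
  }
  [200, headers, [body]]
end)
```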
And after some time:
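Roughly:

```
X-Rack-Cache: stale, valid, store   # "store" - even though the already cached response is simply being reused
```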
Notice something peculiar? Even though rack-cache is clearly reusing an already stored response, it prints store to the log - as if it “stores” the new response. Therefore: lesson 2.
Lesson 2: debug information from caching proxies can be very confusing.
Actually conditional GETs
Now let’s get to the meat of the matter: if you want true “read-after-write” consistency, here is what you need to do:
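A sketch - the Cache-Control shrinks down to just the revalidation directive:

```ruby
"Cache-Control" => "must-revalidate", # no max-age: every request is revalidated against our ETag
```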
Yes, you remove max-age entirely. It means that every request will end up with our origin app and may cause some computation, but if it is cheap enough - we would just confirm the validator and respond with a 304.
If you edit the file and restart the server - the etag will change, and every subsequent revalidation will not have a matching ETag.
The subtle art of ETag divination
It is a bit of an art to come up with an ETag which accurately reflects the state of the resource displayed, but also of the application at large. For example, changes in your gems will likely lead to changes in rendered output. Changes in the objects you fetch into the view will, as well. And if your resource contains lists of things - the disappearance of an item from a list should also change the ETag.
Rails does provide something called cache_key which is available on all ActiveRecord models. The problem is that this cache_key can’t be used as the ETag, because it contains… the updated_at - and a truncated one at that. So it is actually just a Last-Modified supplemented with the model class name and ID:
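Illustratively (the exact format varies between Rails versions):

```ruby
Widget.find(1).cache_key
# => "widgets/1-20250301100000000000"  (class name, id, truncated updated_at)
```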
And that’s… meh. It does help with identifying queries, luckily:
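Illustratively, and again depending on the Rails version:

```ruby
Widget.where(state: "shipped").cache_key
# => "widgets/query-f4735e4838c91f0a2a8b1e5f0b1a2c3d-3-20250301100000000000"
#    (a digest of the SQL, the row count, the newest updated_at)
```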
but is not really usable as ETag either. It is, after all, a digest of the query, but not of what the query produces. The way I approach it is heavy-handed but (in my opinion) very effective:
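I will not claim this is verbatim, but the gist is: digest the data you are about to render, not the query that found it. Something like:

```ruby
# Heavy-handed: serialize every attribute of every record that feeds the page
etag = %("#{Digest::SHA256.hexdigest(widgets.map(&:attributes).to_json)}")
```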
If you want to have multiple values contribute to the ETag, a good tool for this is Digest, used in its stateful form:
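For example (APP_REVISION and current_user here are stand-ins for whatever actually shapes your response):

```ruby
digest = Digest::SHA256.new            # stateful: feed it values one by one
digest << APP_REVISION                 # the deployed code also shapes the output
digest << current_user&.id.to_s        # ...as does who is looking at the page
widgets.each { |widget| digest << widget.cache_key }
etag = %("#{digest.hexdigest}")
```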
Like any other cache key, ETags require finesse. For example, imagine your ETag is for a page with a list of widgets, and the URL already specifies that we will filter down to ?state=shipped widgets. You may then have an ActiveRecord relation for those widgets:
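For instance:

```ruby
widgets = Widget.where(state: "shipped")
```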
But just the query is not enough. There are two possible cases where our resource (the page, API response etc.) will change:
- A widget is no longer shipped or a widget has become shipped since we computed our ETag
- A widget has been deleted entirely even though it was shipped
If we show all our widgets on the page, but may want to avoid rendering our view - and still produce a good ETag - we can do this:
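A sketch:

```ruby
# One database roundtrip: the ids and update times of everything that would be on the page.
# A widget appearing, disappearing or changing all alter this list - and thus the ETag.
membership = widgets.order(:id).pluck(:id, :updated_at)
etag = %("#{Digest::SHA256.hexdigest(membership.to_json)}")
```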
This captures collection membership accurately.
Shortcutting ETag divination even more
I’ve lately enjoyed SQLite a great deal. One of the advantages with SQLite is that it does have a very good Last-Modified - the actual mtime of your entire database! Combined with the Git SHA (APP_REVISION) it allows for great ETags - while those won’t be “read-after-write” consistent, they will be quite stable.
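A sketch, assuming the database lives at db/production.sqlite3:

```ruby
db_mtime = File.mtime("db/production.sqlite3") # changes whenever anything in the database changes
etag = %("#{APP_REVISION}-#{db_mtime.to_i}")
```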
This will give you a very reasonable ETag for the entire website.
Expiring your cache
This does not just happen occasionally - it will happen to you while you are making your site cacheable. Your caches will become stale, the site will get updates - but the visitors will not be getting the latest version, or - which is more likely - you will bodge the ETag derivation and miss something. It is a subtle art, and cache invalidation is one of the hard problems of computer science – no biggie.
With the strategy I’ve described above, you don’t need to reach for that big “Expire all the things” button in your CDN admin panel. All you need to do is this:
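Mix something that changes on every deploy into your ETags - a release stamp your deploy process writes out, for example (the file name here is hypothetical):

```ruby
RELEASE_STAMP = File.read("RELEASE_STAMP").strip  # written at build/deploy time
etag = %("#{RELEASE_STAMP}-#{content_digest}")    # content_digest: whatever you derived above
```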
or, since I bake the Git SHA into every Docker image I create:
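…the Git SHA can play that role (the environment variable name is an assumption - use whatever your build bakes in):

```ruby
APP_REVISION = ENV.fetch("GIT_SHA")            # baked into the image at build time
etag = %("#{APP_REVISION}-#{content_digest}")
```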
This way, all of your caches will automatically invalidate on every deploy. How? Well…
- Since your Cache-Control does not specify max-age, the max-age is considered to be 0, so…
- …the caching proxies will always revalidate - since you said must-revalidate…
- …so they will always do a conditional GET, which…
- …will find a changed ETag and therefore refresh the cached resources.
As Tobi Lütke wrote many moons ago - you never want to manually expire.
Actually, I bet this is how the “instant expiry” on CloudFlare works too - they just bump a few bytes in your cache keys. But you can do that on your end as well!
Content-addressable resources and immutability
That is something you can get with ETags and systems like Git - but you can also do this if you store some kind of checksum with every model in your system (which is a bit tricky - but doable). Imagine every model you display has some kind of checksum_bytes column in the database, which stores a SHA1 digest of all the attributes on save. Kind of like this:
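A sketch, assuming a binary checksum_bytes column on the table:

```ruby
class Widget < ApplicationRecord
  before_save :recompute_checksum

  private

  def recompute_checksum
    # SHA1 over all attributes except the checksum itself, in a stable order
    digestable = attributes.except("checksum_bytes").sort.to_json
    self.checksum_bytes = Digest::SHA1.digest(digestable) # raw 20 bytes
  end
end
```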
Then you can output the _attribute_checksum_bytes (converted to Base64 or other palatable string representation) as an ETag, or mix it into your ETag for invalidation:
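In a Rails controller that could look like this (a sketch - fresh_when lets Rails do the 304 handling for us):

```ruby
def show
  widget = Widget.find(params[:id])
  # Rails digests the array into the final ETag value
  fresh_when etag: [APP_REVISION, widget.checksum_bytes]
end
```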
Another measure you can use is the optimistic locking which conveniently provides a lock_version:
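A sketch:

```ruby
# Adding a lock_version column enables optimistic locking automatically:
#   add_column :widgets, :lock_version, :integer, default: 0, null: false
# ActiveRecord bumps it on every update, so it doubles as a version for cache keys:
fresh_when etag: [widget.class.name, widget.id, widget.lock_version]
```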
which would also enable easier conversion of an ActiveRecord into a stable cache key, and then - to an ETag.
Avoid immutability
The temptation would then also be to say (which is possible with some CDNs)
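That is, a header along the lines of:

Cache-Control: public, max-age=31536000, immutable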
But check this out: imagine that Mallory became one of the patrons of your website. Mallory posts a piece of CSAM on your site, which your system duly checksums, and marks “immutable”. The piece of CSAM ends up in your CDN caches, and since it’s marked “forever” the CDN is not going to “ring back” and ask whether it is still in place. It will happily continue serving it to visitors, oblivious to the fact that you have actually received a court order commanding you to erase the resource within 12 hours - which you totally did.
Therefore: unless it concerns some very benign, bespoke resources that you can be absolutely positive will never ever need to be deleted - and most likely you have none of those - just don’t rely on that feature. Ever.
A 404 is ephemeral
There is a reason that 404 Not Found and 410 Gone are different status codes. Here is a hypothetical scenario:
- There is an address on your site called /exciting-announcement. It now returns a 404
- It was never used, but now the marketing department wants to use it to put out an announcement
- The announcement is there, and is happily present for a week or so, until…
- The legal department finds that the /exciting-announcement contains conflicting (and illegal) claims about a competitor, and thus has to be pulled immediately. It now returns a 404
- …but later on a way is found to make the announcement without disparaging the competitor. The URL is reinstated, but with different content.
All of those scenarios will be ruined if your 404 gets cached by a proxy. And it can get cached provided that you enable caches for non-2xx responses - this is usually not the default for caching proxies, but you may want to do it sometimes because one of the ways your web app can be brought to its knees is by bombarding it with requests to non-existent pages.
- Instating the page will not work, as proxies that have cached the 404 by accident will keep serving it
- Once the page is in place, the proxies will happily cache it even further…
- …so the request from legal to delete the page will not work…
- and so on
This kind of ephemeral absence is exactly what the 404 code is for. If you are dealing with user-uploaded content, do yourself a favor and design a workflow with “permadeletion” where a resource would return a 410 status. And that response can have cache headers and be cached, because it is one of the very few truly immutable responses.
My advice? If you cache 404s, give them a finite max-age - and maybe do not cache them at all. And do not play with immutable or content-addressable resources until you have a firm grasp of the consequences.
An interesting part of CDN offerings
CloudFront, for one, is an incredible product. I was very, very impressed at the depth and effectiveness of the package they offer. But take a look at this page:
One of the headliner features in CloudFlare used to be Cache Settings Override - a feature in CloudFlare’s Enterprise plan that allows you to ignore origin cache headers and set your own caching rules at the edge, effectively overriding whatever cache control headers your origin server sends.
Surprising, isn’t it? There is a feature whose explicit purpose is not honoring the Cache-Control headers your origin sends. If you put this and Guillermo’s post together (even though Vercel and CloudFlare bump heads constantly) – you can figure out what is going on, but I will outline it just in case. There is a strong business use case which those CDN providers are addressing.
- You are a large organization called AcmeCorp. You have a number of web properties, which all have become slow. Very, very slow.
- At properties 3, 7 and 12 the freshly formed DevOps™ Agile Transformation Team formed just 3 quarters ago has investigated performance and has experimented with Cache-Control to make things better.
- Sadly, the team was disbanded, the tech lead on it fired and all the ops work of that team got outsourced to the secondary HQ in Saarbrücken. They are in the middle of their third reorg – and no end in sight, so there is nobody who owns the cache configuration.
- Meanwhile, you get a mandate to do something because even the COO notices that web property number 3 has become insufferably slow. Users complain, support queues are growing.
- …but due to having removed all technical leads from your part of the organization there is nobody to call Saarbrücken, get access to the repository, take ownership of the code setting the headers and fix them. Even where you are at, close to the COO.
Therefore, what do you do?
- One option is retooling your web property 3 into something very fancy with maximum Vercel lock-in, and hoping that their tricks with streaming RSCs will magically make things go brr. You will also need an entire team completely up to date on all the newest React/Next.js practices.
- Put web property 3 behind CloudFlare, but then use the admin console to finally override those pesky bodged headers that those blokes in Saarbrücken just don’t want to fix (they are still gathering about whether a Doktor Fachinformatiker should be a requirement on their job descriptions or not), and be done with your day.
Of course you choose the latter.
The CDNs are having a ball solving the problem of organizations that either cannot get their Cache-Control right, or are dysfunctional enough that - even though someone inside could get it right - they are utterly unable to drive the change through. With this perspective in mind, Guillermo’s post reads differently, doesn’t it?
This is another reason why knowing your way around Cache-Control is imperative. Don’t be the team in Saarbrücken and don’t let yourself be sucked into the suffocating Vercel embrace.
Balancing validations and read-after-write
It may be that you want read-after-write consistency only for the people who can actually change the content on the page. That sounds harder than it is - in practice it is very simple. All you need to do is bump the URL.
Yes, that easy. Your cache key in a caching proxy is always a combination of the full URL, the values of headers mentioned in Vary and the validators. To force a refresh for a URL, you may get by with just this:
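For example, by handing the people who just edited the page a link with a throwaway query parameter on it - the name does not matter, it merely changes the cache key (page_url here is hypothetical):

```ruby
edit_preview_url = "#{page_url}?fresh=#{Time.now.to_i}"
```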
For a site with public-facing pages I would actually not do that, though, and just speed up my revalidations enough that they become usable for every request. This is the next important lesson.
You want less config, not more
The header we have specified in the end (Cache-Control: must-revalidate) is actually implicitly expanded into this:
Cache-Control: public, max-age=0, no-cache, must-revalidate
And this is confusing as well, because no-cache actually means “do cache, but always revalidate first”.
Lesson 3: Cache control is intricate, and you should aim to do as little of it as you can.
This actually applies to more things in modern web development, and it is a pity that both the modern platform teams and modern frontend teams work in the opposite direction (more configs and more systems, as far as the eyes can see).
Conclusion
First, let’s reiterate the lessons one more time:
- Lesson 1: Once you have told your caching proxies to cache something, it may be a nuisance to make them “forget” that cached data.
- Lesson 2: debug information from caching proxies can be very confusing.
- Lesson 3: Cache control is intricate, and you should aim to do as little of it as you can.
And here is how you can benefit from Cache-Control, today:
- Use an embedded caching proxy inside your application to model the side-effects
- Once your embedded caching proxy is rock-solid and configured to your liking, set up a CDN
- Read your CDN’s documentation regarding Cache-Control. For example, here’s CloudFlare’s
- For a publicly-accessible “content” website, you likely want either public, must-revalidate with ETags or public, max-age=30, must-revalidate with ETags.
While Cache-Control is fairly well-defined, there are still peculiarities to how CDNs treat it. For example, stale-while-revalidate is very handy, but it only functions in combination with max-age being higher than 0 on CloudFlare. s-maxage can be useful too. But before you reach for these - get your basic, locally-testable Cache-Control rock-solid.
Because I want your website to be fast, and I am not selling you a CDN. Serious.