Table of Contents
- I'm the CEO of htmx
- If Flash and Visual FoxPro are hypermedia, then everything is
- If your custom non-standard XML DSL is hypermedia, then everything is
- It's in the client, baby
- If your custom non-standard <insert name> is hypermedia, then my <insert other name> is
- Contortionists belong in Cirque du Soleil
I'm the CEO of htmx
Ever since htmx burst onto the web scene with marketing and online interactions that some find funny and some find obnoxious, I have had a very peculiar beef with them.
Hi, I am Dmitrii, and I am the CEO of htmx. Let's talk about hypermedia. It will be long-winded, meandering, losing its threads, ranty, and perhaps even incoherent. For this I apologise.
I'll start with a few quotes:
A Hypermedia Driven Application interacts with the server in terms of hypermedia (i.e. HTML) rather than a non-hypermedia format (e.g. JSON)
Hypermedia Driven Applications
JSON isn’t a hypermedia because it doesn’t have hypermedia controls.
...
The deeper problem ... is that, for this JSON response to participate properly in a hypermedia system, the client that consumes the JSON needs to also satisfy the constraints that the RESTful architectural style places on the entire system.
Hypermedia Clients
yes, I believe HTML is the Only True Hypermedia
you can tell this by the inclusion of HXML/hyperview in my book hypermedia.systems & by the 1st sentence in my ACM paper:
"A defining characteristic of hypermedia systems is the presence of hypermedia controls"
So yeah. I have an issue with all that.
If Flash and Visual FoxPro are hypermedia, then everything is
htmx's HATEOAS essay doesn't define hypermedia, but links to this Wikipedia page for an explanation and definition: https://en.wikipedia.org/wiki/Hypermedia
Hypermedia, an extension of hypertext, is a nonlinear medium of information that includes graphics, audio, video, plain text and hyperlinks.
Hypermedia is a type of multimedia that features interactive elements, such as hypertext, buttons, or interactive images and videos, allowing users to navigate and engage with content in a non-linear manner. ... Multimedia development software such as Adobe Flash, Adobe Director, Macromedia Authorware, and MatchWare Mediator may be used to create stand-alone hypermedia applications, with emphasis on entertainment content. Some database software, such as Visual FoxPro and FileMaker Developer, may be used to develop stand-alone hypermedia applications, with emphasis on educational and business content management.
The whole Wikipedia article already argues my point for me, but let's go on anyway.
In 2025, if you go ahead and click on a link to an .swf file, the browser will just download it, and the OS will ask which program you want to use to open it. This was the case even in the Flash heyday, and many a curse was uttered throughout the world when you couldn't play something on Kongregate.
Speaking of Kongregate... It's still around and offers to play games written in Unity. Is Unity hypermedia, too?
Besides, buttons in Flash don't do anything. You have to write some ActionScript and attach it to a button's click event. Sounds familiar.
Let's put Flash aside for a second. After all, htmx often talks about formats, and the Flash format clearly has things to describe buttons, and forms, and links, and a plethora of other things (you can see them in the spec). What about formats that don't have buttons and links?
Well, when I open what is basically a plain text file in a favourite code editor like IntelliJ IDEA, or a less favourite editor like VS Code, I get plenty of actions and navigation. I can Cmd-click/Ctrl-click basically everything in the file and navigate around easily, non-linearly, with back and forward navigation and retained scroll position.
Does the format have buttons and links and other things? Well, no. Does the client have a very standardised way of navigating and linking entities in this format? Yes. Does it need a specialised client to do that? Yes, just like Flash. And there are more clients capable of doing that than there are Flash clients. And, dare I say, an order of magnitude more clients than there are htmx clients.
The lines get significantly more blurry after this.
On macOS you can select and follow links from images. Directly in apps, including the browser. Here's me right-clicking on text in an image on a web page:
Does the format have "hypermedia controls"? Well, no. Do you need a specialised client to do this, like Flash? No, not really, it's a feature of the OS. And yes, really, because it's a feature not available in other OSes, or in other versions of the OS... Just like with Flash. And FoxPro. And situations where htmx is not available. And...
So, what we have here is a combination of things where hypermedia is:
- both the format (HTML) and a commonly available client (the browser)
- both the format (SWF) and a very specialised client (Flash Player)
- no format (plain text) and rather commonly available clients (IDEs, code editors)
- no format (PNG, JPEG etc.) and a very specialised client (OS-level text recognition in images)
And all of them provide the same functionality attributed to hypermedia. And in htmx's case attributed to hypermedia format only.
If your custom non-standard XML DSL is hypermedia, then everything is
The thing with htmx is that they say "HTML is the only true hypermedia", and then contort every definition and text to conform to HTML. Since this is the wrong approach, they end up contradicting themselves every which way.
We'll get to custom XML in a second, but let's start with htmx itself.
htmx is a JS library that adds arbitrary custom non-standard functionality to HTML, and requires custom arbitrary HTTP headers on requests and responses to work with server-side code. Oh, and it has its own custom arbitrary DSL for a bunch of functionality on top of that.
The actions in htmx, especially those that rely on server operations, go more or less like this:
- make a request with a bunch of special request headers like HX-Boosted or HX-Target
- see if response contains specific custom headers
- parse the response into a data object
- walk that data object looking for specific keys
  - for each special key do key-specific things. E.g. preserve original elements
  - for each defined event trigger event
- insert data object into the DOM
  - for all inserted elements set/restore classes and styles, and trigger events
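To make the shape of these steps concrete, here is a toy model in Python. It is not htmx's code and makes no claim about htmx internals: the `X-Toy-Trigger` response header, the `_preserve` key, the `swapped` event and both function names are made up for illustration, and only `HX-Boosted` and `HX-Target` come from the list above. The payload is deliberately JSON, and the "DOM" is a plain dict.

```python
import json


def request_headers(target_id):
    """Step 1: attach special request headers (names taken from the list above)."""
    return {"HX-Boosted": "true", "HX-Target": target_id}


def apply_response(page, target_id, response_headers, body):
    """Steps 2-5 for one response; returns the list of events that were fired."""
    events = []

    # Step 2: check for a custom response header (X-Toy-Trigger is made up).
    if "X-Toy-Trigger" in response_headers:
        events.append(response_headers["X-Toy-Trigger"])

    # Step 3: parse the response into a data object.
    data = json.loads(body)

    # Step 4: walk the object for special keys (the "_preserve" key is made up):
    # fields listed there keep the value that is already on the page.
    old = page.get(target_id, {})
    for field in data.pop("_preserve", []):
        if field in old:
            data[field] = old[field]

    # Step 5: insert the object into the "DOM" (a dict here) and fire an event.
    page[target_id] = data
    events.append("swapped")
    return events


page = {"account": {"balance": 50, "note": "typed by the user"}}
headers = request_headers("account")
events = apply_response(
    page,
    "account",
    {"X-Toy-Trigger": "balance-changed"},
    '{"balance": 75, "note": "", "_preserve": ["note"]}',
)
print(headers)
print(page)
print(events)
```

Nothing in this flow depends on the payload being HTML; the client's conventions do all the work.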
Of course, as soon as you remove the htmx library from the equation (for example, by disabling JavaScript on the page), none of this works.
If you do the exact same with JSON, however, this somehow stops being hypermedia: https://htmx.org/essays/hypermedia-clients/
...occasionally, a smart and experienced web developer will reply with something along these lines:
OK, mr. REST-y pants, how about this JSON?
    {
      "account": {
        "account_number": 12345,
        "balance": {
          "currency": "usd",
          "value": 50.00
        },
        "status": "open",
        "links": {
          "deposits": "/accounts/12345/deposits",
          "withdrawals": "/accounts/12345/withdrawals",
          "transfers": "/accounts/12345/transfers",
          "close-requests": "/accounts/12345/close-requests"
        }
      }
    }

There, now there are hypermedia controls in this response (normal humans call them links, btw) so this JSON is a hypermedia.
So this JSON API is now RESTful. Feel better?
😑
One must concede that, at least at a high-level, our online adversary has something of a talking point here: these do appear to be hypermedia controls, and they are, in fact, in a JSON response. So, couldn’t you call this JSON response RESTful?
Being obstinate by nature, we still wouldn’t be willing to concede the immediate point without a good ackchyually or two:
- First, these links hold no information about what HTTP method to use to access them
- Secondly, these links aren’t a native part of JSON the way that, for example, anchor and form tags are with HTML
- Third, there is a lot of missing information about the hypermedia interactions at each end point (e.g. what data needs to go up with the request.)
And so on: the sorts of pedantic nit-picking that makes technical flame wars about REST on the internet such a special joy.
However, there is a deeper ackchyually here, and one that doesn’t involve the JSON API itself, but rather the other side of the wire: the client that receives the JSON.
... for this JSON response to participate properly in a hypermedia system, the client that consumes the JSON needs to also satisfy the constraints that the RESTful architectural style places on the entire system.
This is from a library whose code is this:
    <!-- Custom attributes placing constraints on the client that consumes this -->
    <div hx-target="this" hx-swap="outerHTML">
      <!-- Links/Buttons don't have info which HTTP methods to use, so we use custom attributes -->
      ...
      <button hx-get="/contact/1/edit">Edit</button>
    </div>

    <!-- Custom attributes to tell the JS library which methods to use since links have no info -->
    <form hx-put="/contact/1" hx-target="this" hx-swap="outerHTML">
    </form>

    <!-- Custom attributes and custom DSL that the browser has no knowledge of -->
    <input name="q" hx-get="/search" hx-trigger="input changed delay:1s" hx-target="#search-results"/>

So, to very liberally paraphrase htmx, as long as it looks like HTML, it doesn't matter how custom or non-standard your data is, and it doesn't matter if you need 14kb of minified and gzipped JavaScript to process it.
Well, it turns out that this is not that liberal of a paraphrasing. Let me introduce you to Hyperview from the authors of htmx:
On the web, pages are rendered in a browser by fetching HTML content from a server. With Hyperview, screens are rendered in your mobile app by fetching Hyperview XML (HXML) content from a server. HXML's design reflects the UI and interaction patterns of today's mobile interfaces
In their book Hypermedia Systems, Hyperview is described as follows:
Hyperview is an open-source hypermedia system that provides:
- A hypermedia format for defining mobile apps called HXML
- A hypermedia client for HXML that works on iOS and Android
- Extension points in HXML and the client to customize the framework for a given app
Oh, it gets better:
Hyperview provides an open-source HXML client library written in React Native. With a little bit of configuration and a few steps on the command line, this library compiles into native app binaries for iOS or Android. Users install the app on their device via an app store. On launch, the app makes an HTTP request to the configured URL, and renders the HXML response as the first screen.
Yes. It's a custom non-standard subset of XML that requires a custom client to process and render on screen using the underlying platform's established primitives.
As long as it's not a non-standard JSON that requires a custom client to process and render on screen, I guess.
It's in the client, baby
HTML the format has things like links, buttons and forms. It requires a specific client to render all those with some predefined behaviours. Otherwise HTML is just plain text. To do anything beyond those predefined behaviours you need to create some custom extensions and a client library in JavaScript. That is, it's the JavaScript library that is the client, and the browser is just the rendering target.
SWF the format has things like links, buttons, forms and other stuff. It requires a client to render all those. Moreover, to make anything interactive you have to write specific custom scripts for the objects you want the user to interact with. A button on its own does nothing. And without the client it's just a binary blob.
Structured plain text has nothing like links, buttons, forms or any other stuff. It doesn't prevent clients from rendering those files with full navigation, links, buttons, and actions.
htmx is a custom non-standard extension to HTML with a single client capable of working with it properly: the htmx JavaScript library. Without the client, htmx sites are fully broken.
HXML is a custom non-standard XML DSL that uses React Native as its rendering target. The client is the Hyperview RN client. Without it, HXML is just plain text, or invalid XML that nothing knows how to process.
Let's go back to the Wikipedia description (emphasis mine):
Hypermedia, an extension of hypertext, is a nonlinear medium of information that includes graphics, audio, video, plain text and hyperlinks.
Hypermedia is a type of multimedia that features interactive elements, such as hypertext, buttons, or interactive images and videos, allowing users to navigate and engage with content in a non-linear manner.
"Medium of information" and "type of multimedia" don't really say much about formats. Multimedia is consumed by client software, and any interaction is really provided by the client.
For example, PDF has forms, and buttons, and links. For the longest time you couldn't do anything with those fields in most clients other than Acrobat Reader.
If you disable JavaScript on an htmx website, then the htmx client will be disabled, and there will be no functionality provided by its "hypermedia format".
Hypermedia is a function of the client, not of the format.
So, is this hypermedia:
    {
      "account": {
        "account_number": 12345,
        "balance": {
          "currency": "usd",
          "value": 50.00
        },
        "status": "open",
        "links": {
          "deposits": "/accounts/12345/deposits",
          "withdrawals": "/accounts/12345/withdrawals",
          "transfers": "/accounts/12345/transfers",
          "close-requests": "/accounts/12345/close-requests"
        }
      }
    }

Yes, yes it is. As long as you have a client which understands this format, and knows what to do with these links.
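Here is a minimal sketch of such a client in Python. It knows nothing about accounts or balances; the only thing it understands is the convention that a `links` object maps names to URLs. The function names and the `bank.example` base URL are hypothetical, and the numbered text menu merely stands in for whatever controls a real client would draw.

```python
import json
from urllib.parse import urljoin

RESPONSE = """
{
  "account": {
    "account_number": 12345,
    "balance": {"currency": "usd", "value": 50.00},
    "status": "open",
    "links": {
      "deposits": "/accounts/12345/deposits",
      "withdrawals": "/accounts/12345/withdrawals",
      "transfers": "/accounts/12345/transfers",
      "close-requests": "/accounts/12345/close-requests"
    }
  }
}
"""


def find_controls(node, base_url):
    """Collect every entry of every "links" object as (name, absolute URL)."""
    controls = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "links" and isinstance(value, dict):
                for name, href in value.items():
                    controls.append((name, urljoin(base_url, href)))
            else:
                controls.extend(find_controls(value, base_url))
    elif isinstance(node, list):
        for item in node:
            controls.extend(find_controls(item, base_url))
    return controls


def render(document, base_url):
    """Render the controls as a numbered menu, much as a browser renders anchors."""
    lines = []
    for index, (name, url) in enumerate(find_controls(document, base_url), start=1):
        lines.append(f"[{index}] {name} -> {url}")
    return "\n".join(lines)


document = json.loads(RESPONSE)
controls = find_controls(document, "https://bank.example/")
print(render(document, "https://bank.example/"))
```

The same two functions would work on any other response that follows the `links` convention, which is the sense in which the client, not the format, supplies the navigation.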
Moreover, there are dozens of standardised ways to describe resources and operations on them through Link Relation Registry and RFC 8288 alone. Microformats add many, many, many more through various standards bodies and organisations. Or you could go all in with JSON Schema like JSON Hypertext Application Language and JSON Hyper-Schema, or even roll your own.
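As a rough illustration of the RFC 8288 route, a client can read link relations from an HTTP `Link` header without looking at the body at all. The parser below is a simplified sketch, not a full implementation of the RFC: it assumes there are no commas or semicolons inside URLs or quoted parameter values, and it only looks at the `rel` parameter. The `api.example` URLs are hypothetical; `next` and `last` are registered link relation types.

```python
import re


def parse_link_header(value):
    """Parse a simple Link header into a {relation: target URL} dict."""
    links = {}
    for part in value.split(","):
        match = re.match(r"\s*<([^>]*)>(.*)", part)
        if not match:
            continue  # skip anything that does not start with <target>
        target, params = match.groups()
        rel_match = re.search(r';\s*rel\s*=\s*"?([^";]+)"?', params)
        if rel_match:
            # A single rel value may list several space-separated relations.
            for rel in rel_match.group(1).split():
                links[rel] = target
    return links


header = (
    '<https://api.example/items?page=2>; rel="next", '
    '<https://api.example/items?page=9>; rel="last"'
)
links = parse_link_header(header)
print(links)
```

A client written this way navigates by relation name (`links["next"]`) rather than by hard-coded URL, regardless of what the response body's format is.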
In the end you will have an arguably much better hypermedia format and clients than an ad-hoc unspecified custom DSL (with a few other DSLs inside this DSL). Oh, you will also get a better understanding and implementation of HATEOAS, too.
Contortionists belong in Cirque du Soleil
No, really, look at this: https://www.youtube.com/watch?v=m7mz6VHLJSc
I guess, in the end, the main issue I have with htmx is that they misappropriate works by Ted Nelson (notably, his introduction of hypertext) and Roy Fielding (notably, his dissertation that introduced REST but also other texts).
Once you've already arrived at a conclusion ("HTML is the only true hypermedia"), you have to contort everything else to support this conclusion.
For example, in How Did REST Come To Mean The Opposite of REST? the htmx authors start with a link to Roy Fielding's article, REST APIs must be hypertext-driven. It doesn't matter what the article actually says, or what Roy Fielding himself says in the comments under the article. As long as you have a conclusion to reach, nothing matters. Here's the conclusion:
The crucial difference between these two responses, and why the HTML response is RESTful, but the JSON response is not, is this:
The HTML response is entirely self-describing.
A proper hypermedia client that receives this response does not know what a bank account is, what a balance is, etc. It simply knows how to render a hypermedia, HTML.
Here's what Roy Fielding himself says in the comments to his own article:
Hypermedia is just an expansion on what text means to include temporal anchors within a media stream; most researchers have dropped the distinction.
Hypertext does not need to be HTML on a browser. Machines can follow links when they understand the data format and relationship types.
Ted Nelson never talked about HTML in his work (because it was almost 30 years before HTML). Roy Fielding rarely talks about HTML (mostly as an example of a media type that a client and server may agree on).
And yet, htmx authors constantly present things like this:
I have pushed the idea of hypermedia controls being a necessary component of the idea of hypermedia
Note that Ted Nelson did not explicitly mention this idea so you can argue that I'm pushing a novel definition
This is not a novel definition. It's fitting the facts to the conclusion. Especially when you yourself require 14 kilobytes of minified JavaScript to render the "self-describing hypermedia" in the browser.
Hypermedia, as Roy Fielding himself says, is the property of a machine talking to another machine using a media type both understand. The media type aka format itself doesn't matter as long as both machines understand it. Even presentation of controls to the user is a secondary concern and can be performed by the machine as it sees fit.
Because, once you've unwrapped the contortionist back into their human shape, hypermedia is a property of the client.