Someone recently asked the greybeards over at Hacker News,
Why did COM/SOAP/other protocols fail?
With the recent buzz around MCP, it made me think about what I’ve read about other unifying protocol attempts in the past. Why did these 2000s era interoperability protocols fail, and what does MCP do different? Was it a matter of security issues in a newly networked world? A matter of bad design? A matter of being too calcified? I would love to hear from those who were around that time.
I have the dubious distinction of being one of very few people in the world who has actually worked on an implementation of DCOM. It’s one of the many bits of legacy tech cluttering up my brain alongside Object Pascal, Perl, how to write SoundBlaster drivers and other stuff I’ll hopefully one day finally forget.
Dear reader, even assuming you are an experienced programmer you have most likely never heard of DCOM, and might not even have heard of SOAP. This brief oral history is for your benefit.
What are we talking about again?
The questioner on Hacker News conflated (D)COM and SOAP together. But these are very different protocols from different eras, and they became legacy tech for reasons that are also different (yet related). So we have to treat them separately.
We can split the history of network protocols into three rough eras:
- The 1980s. Birth of the internet. The operating system gives you a byte stream or maybe a message pipe. Everything else is left to the developer. Both binary and line-based protocols are popular. Every protocol reinvents basics like framing (how to tell where a message begins and ends), sequencing (who speaks and when), authentication and every other detail as well (a sketch of one common framing scheme follows this list). Nothing is encrypted.
- The 1990s. Object oriented programming was a new concept and becoming wildly popular. Some tech companies observed that protocol development involved a lot of repetitive work. The solution seemed obvious: make object oriented programming function over the network! How hard could it possibly be?
In 1989 eleven companies formed the Object Management Group, which defined a new specification called CORBA. In 1995 Microsoft developed its own competitor to CORBA called DCOM. Sun helped develop CORBA and also built its own Java-specific equivalent called RMI. These object-oriented network protocols achieved some brief success and DCOM lives on in factory automation, but they died out in the early 2000s and ceased to be used for new development. Actually they were never that popular to begin with. Meanwhile the web is taking off and HTTP — a classic 80s era line oriented design — becomes the first network protocol to gain encryption thanks to Netscape’s efforts on SSL.
- The 2000s. OOP is out, Web is in. An ecosystem is flourishing around HTTP that solves many problems the OOP protocols ignored. In 1998 Dave Winer introduces XML-RPC, which uses web-style protocols to implement simple stateless function calls. Once again, tech companies notice that protocol development is awkward and time consuming, and once again they form a standards committee to try and “improve” XML-RPC. The result is SOAP, which — once again — achieves some minor success before being rapidly supplanted by …
- The 2010s. Ad-hoc protocols using JSON over HTTP. What additional abstraction exists is layered on top of that base, e.g. OpenAPI and JSON Schema.
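To make the framing point concrete, here is a minimal sketch in Python (purely illustrative, not any particular 1980s protocol) of the length-prefixed framing that every binary protocol of that era reinvented for itself:

```python
import socket
import struct

def write_frame(sock, payload: bytes) -> None:
    # Prefix each message with its length as a 4-byte big-endian integer,
    # so the receiver can tell where one message ends and the next begins.
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def read_frame(sock) -> bytes:
    # Read exactly four bytes of length prefix, then exactly that many payload bytes.
    (length,) = struct.unpack(">I", _read_exactly(sock, 4))
    return _read_exactly(sock, length)

def _read_exactly(sock, n: int) -> bytes:
    # TCP hands you a byte stream, not messages: recv() may return less than
    # asked for, so keep reading until all n bytes have arrived.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-frame")
        buf += chunk
    return buf

if __name__ == "__main__":
    a, b = socket.socketpair()   # stand-in for a real network connection
    write_frame(a, b"hello")
    write_frame(a, b"world")
    print(read_frame(b), read_frame(b))   # b'hello' b'world'
```

Sequencing, authentication and encryption were all further layers on top of this, and every protocol of the era solved them differently.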
Why has it proven so hard for the industry to level up beyond simple text-based protocols?
OOP RPC review
Very few people still in the industry have used 1990s-era OOP RPC protocols. They all looked pretty similar, so let’s quickly review how they worked.
- They were fully stateful. A server was a process that “exported” classes via some mechanism. A client “activated” an instance of a class via some API, receiving a proxy object sometimes called a stub. Stubs “marshalled” (serialized) function calls to a binary network protocol which would then be “unmarshalled” by a “skeleton” that converted it back to a function call on the right instance of the server object (see the sketch below). Some classes were singletons and others allowed clients to create instances.
- There was no client/server distinction because, being OOP, these systems not only allowed servers to send pointers to clients but also allowed clients to allocate objects and pass pointers back to the server. This is how “server push” (callbacks) worked. A graph of objects was built up that spanned both sides of a network connection.
- The network protocols were not designed to be implemented directly. They were details hidden behind libraries.
- They were strongly typed. DCOM had the best support for dynamically typed languages thanks to Microsoft’s BASIC heritage, but the basic assumption of all these systems is that there is some sort of shared multi-language type system in place.
On top of this basic concept DCOM, CORBA and RMI built up layers of features. Maybe you want security to control who can invoke functions on your objects? They have a spec for that. How do you express what functions are available? They have a standard interface definition language. What if you want a database transaction to be coordinated across different servers? They do that too.
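Here is a heavily simplified sketch of the activation/stub/skeleton flow described above. It is illustrative Python using JSON over a local socket pair, not any real DCOM, CORBA or RMI wire format, but it shows roughly what the vendor libraries did for you: the stub turns a method call into a message naming an object instance, and the skeleton dispatches that message to the live object on the other side.

```python
import json
import socket
import threading

# --- "Server" side: activated objects live in RAM, keyed by an object id ---
class Clock:
    def now(self) -> str:
        return "2004-01-01T00:00:00Z"

exported = {1: Clock()}   # object id -> live instance

def skeleton(sock) -> None:
    # Unmarshal one call, invoke it on the target instance, marshal the reply.
    # (A real system would frame and type-check the message; omitted for brevity.)
    call = json.loads(sock.recv(65536).decode())
    target = exported[call["oid"]]
    result = getattr(target, call["method"])(*call["args"])
    sock.sendall(json.dumps({"result": result}).encode())

# --- "Client" side: a generic proxy object (the stub) ----------------------
class Stub:
    def __init__(self, sock, oid):
        self._sock, self._oid = sock, oid

    def __getattr__(self, method):
        def remote_call(*args):
            # Marshal the call: which object, which method, which arguments.
            msg = {"oid": self._oid, "method": method, "args": list(args)}
            self._sock.sendall(json.dumps(msg).encode())
            return json.loads(self._sock.recv(65536).decode())["result"]
        return remote_call

if __name__ == "__main__":
    client, server = socket.socketpair()        # stand-in for a real network
    threading.Thread(target=skeleton, args=(server,)).start()
    clock = Stub(client, oid=1)                  # "activate" object #1
    print(clock.now())                           # reads like a local call
```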
The Design By Committee Era
It’s important to stress that everything about these systems would seem completely alien to a modern developer, both technically and culturally.
For example, you might be thinking the above sounds pretty neat and wonder whether it might be fun to implement those protocols yourself. The answer is no, because OOP RPC protocols weren’t designed to be directly used by normal developers. In fact the DCOM wire protocol was a trade secret of Microsoft for many years. Using them meant relying on vendor API implementations. CORBA started out life with no network protocol at all, leaving it all up to the vendors. Eventually it gained a standardized protocol (IIOP), but it was designed as a way to connect different implementations written by BigTech together. The assumption was that only a few people would ever implement IIOP and everyone else would access it via libraries, so the spec was never written with ease of implementation in mind and being fully compliant was very hard. It was closer to modern HTML than HTTP, in the sense that whilst you can theoretically implement it yourself, it would take a lot of effort.
This happened because these systems were all designed to sell “middleware” (libraries, operating systems or languages), and so the designers solved lots of problems they imagined people might have, without actually having those problems themselves. This shows up in a lot of ways and it’s the root cause for HTTP+JSON outcompeting them. You don’t see this approach much anymore in the tech industry because we learned from this experience! Standards nowadays tend to be smaller and adapted from something a tech company developed to meet a real, concrete need they had.
SOAP wasn’t entirely like this, but it arrived at the end of this era and inherited the mindset. SOAP was complex committee-ware in which the authors assumed you’d use an implementation from your language’s standard library, written for you by a big tech company. Maybe those languages and stdlibs would be free-as-in-beer (e.g. Java) but there’d still be a big team backing it.
The problem with this is obvious, but only in hindsight:
- Only Java and .NET were big enough to get really complete implementations of the specs.
- Sun and Microsoft were competitors so neither did proper interop testing against the other’s implementation because in their vision everyone would use their stack for everything. So there were lots of interop bugs in SOAP.
- In the early 2000s there was an explosion of interest in dynamic scripting languages like Python/Ruby. These languages had standard libraries cobbled together out of little bits of code written by volunteers as they worked, which is a bad fit for any tech that has specs running to hundreds of pages with hundreds more being added every year by BigTech employees.
OOP RPC technical problems
OOP RPC wasn’t just sunk by its complexity. There were numerous design failures too.
A standard critique of RPC is that a remote function call isn’t like a local function call because of latency and network errors. But this critique is off base. It wasn’t a big problem in reality.
COM was originally built as a local RPC system. Just because there’s no network cable involved doesn’t mean local RPCs are guaranteed to work. The local server can crash, hang, time out, be quit by the user or not be installed properly when you try to connect to it. DCOM can express all these things because in COM every function call returns an HRESULT, a semi-standardized error code. The actual result of a function call is returned in an “out parameter”. In other words, remote interfaces weren’t expected to be identical to local interfaces. Bindings can hide the distinction by throwing exceptions if they want, but if not nobody was surprised because returning an error code for every function is idiomatic in Windows programming.
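For readers who never wrote COM, the calling convention looked roughly like this. The snippet below is a hypothetical Python rendering rather than real COM C++, but the shape is the same: every call returns a status code, the real result travels in an out parameter, and a binding can choose to surface failures as exceptions instead.

```python
S_OK = 0x00000000      # the real HRESULT meaning "success"
E_FAIL = 0x80004005    # the real HRESULT meaning "unspecified failure"

class ClockProxy:
    """Hypothetical stand-in for a stub to a (possibly remote) COM object."""

    def get_time(self):
        # COM style: the return value is always a status code; the actual
        # result travels in an "out parameter" (modelled here as a tuple slot).
        # A real proxy would return E_FAIL, or something more specific, if the
        # server had crashed, hung, timed out or wasn't installed.
        return S_OK, "2004-01-01T00:00:00Z"

def succeeded(hr):
    # Mirrors COM's SUCCEEDED() macro: failure codes have the high bit set.
    return hr < 0x80000000

# A language binding is free to hide the error codes behind exceptions:
def get_time_or_raise(clock):
    hr, value = clock.get_time()
    if not succeeded(hr):
        raise OSError(f"call failed with HRESULT {hr:#010x}")
    return value

print(get_time_or_raise(ClockProxy()))
```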
By far the biggest real problem was the failure to predict the success or scale of the internet. The designers of OOP RPC were working in the early 1990s and thought on LAN scale. An application, they imagined, is a soup of objects calling each other in complex ways over a small flat network. There’s an ambient assumption at every point that networks are tiny, simple and managed by an IT department.
This shows up in the following problems.
No way to deploy UI logic. OOP RPC protocols had no idea how humans would actually use the objects they exported. That question was viewed as totally out of scope!
Microsoft’s original solution for network deployment of GUI logic was copying an EXE to a shared Windows drive, a technology that only worked over a LAN.
HTTP had an answer that worked over the internet: you ship a UI expressed in HTML, a language for expressing GUIs whose interactions generate more HTTP requests. Today products like my own company’s Hydraulic Conveyor make deploying desktop apps easy, but back when this stuff was playing out distributing GUI logic to clients was capital H hard. Microsoft and Apple were children of the 80s and thought of shipping software in the literal sense of putting things on container ships. Their whole culture was oriented around the logistics of manufacturing compact discs and delivering them to retail outlets on the street.
No load balancers or proxies. Load balancing DCOM is impossible, according to a company that sells load balancers.
Activating a COM object over the internet means passing a domain name to the Win32 CoCreateInstanceEx function, which expects that name to resolve to one specific Windows computer. The entire concept assumes that connecting to the same IP address a second time will reach the same physical computer. What if you want a domain name to represent an entire datacenter, or a global network of datacenters? There was no support for this; the designers simply never imagined it.
Nor is it even clear how you’d implement such proxies! Everything is a pointer to a piece of memory and a thread somewhere, and the protocols are so complex that no ecosystem of tools like nginx ever grew up around them. With DCOM you couldn’t have written an nginx equivalent anyway, because the protocol was both highly complex and closed.
No firewall support. Because the designers assumed network = LAN they had no notion that network connections might be one way only. Passing an object from client to server so you can receive callbacks is idiomatic in OOP RPC, but the server will try to connect to the client to deliver the callback. It won’t reuse the existing TCP connection established by the client. This is a natural consequence of not having any “server” concept in the design to begin with. Thus if the initiator is behind a firewall, it simply breaks.
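The crux of the problem is what a marshalled callback reference actually contains. The sketch below is illustrative Python, not a real DCOM OBJREF, but it captures the essential point: the “callback object” the client hands to the server boils down to an address the server is expected to dial back.

```python
import socket

# Client side: to receive callbacks, the client must itself become a server.
# It exports a callback object by listening on a fresh port of its own.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))    # in reality: the client machine's address
listener.listen()
port = listener.getsockname()[1]

# This is (roughly) what gets marshalled when the client passes a callback
# object to the server: not code, just an address plus an object id that the
# server is expected to dial back later.
callback_ref = {"host": "127.0.0.1", "port": port, "object_id": 42}

# Server side, some time later: delivering the callback means opening a brand
# new TCP connection back to the client rather than reusing the one the client
# originally opened. Behind a firewall or NAT, this connect() simply fails.
def deliver_callback(ref, event):
    conn = socket.create_connection((ref["host"], ref["port"]), timeout=5)
    conn.sendall(f"{ref['object_id']}:{event}".encode())
    conn.close()

deliver_callback(callback_ref, "price changed")
conn, _ = listener.accept()
print(conn.recv(1024).decode())    # fine on a flat LAN; breaks through a firewall
```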
No way to stop DoS attacks. Modern web apps are protected by many layers of defenses capable of stopping even very sophisticated DoS attacks. It’s an essential part of surviving as an online business. Because OOP RPC was imagined for the LAN, there is no way to make such a service DoS proof. Objects live in RAM so all a client has to do is connect once and allocate objects until the server runs out of memory. And because there’s no way to deploy UI logic you can’t solve attacks by quickly deploying CAPTCHAs or making other service modifications.
No good way to authenticate users. OOP RPC protocols did integrate authentication, but because they were designed by operating system vendors for the LAN their core concept was to propagate your local operating system username to the server cryptographically (e.g. using Kerberos or Active Directory). So again there’s an ambient assumption of small scale and carefully managed networks.
No versioning support. If you wanted to add methods to an object in COM you had to create a new interface (objects can export multiple interfaces). But there was no way to change an existing interface nor any of the types it used, so COM-based systems are full of interfaces with names like IFooBar3. This was a poor fit for the rapidly evolving world of online business. Having only a very restricted ability to evolve your API in a backwards compatible way just wasn’t workable in a world where apps were expected to remain online 24/7.
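The pattern looked roughly like this. What follows is a loose Python rendering (real COM does this via IUnknown::QueryInterface); it shows why systems accumulated names like IFooBar, IFooBar2, IFooBar3: you could publish a whole new interface, but never change an existing one.

```python
from abc import ABC, abstractmethod

# Version 1 of the interface: frozen forever once clients exist.
class IFooBar(ABC):
    @abstractmethod
    def frob(self) -> str: ...

# Need a new method? You can't touch IFooBar, so you publish a second
# interface and export both from the same object.
class IFooBar2(IFooBar):
    @abstractmethod
    def frob_faster(self) -> str: ...

class FooBarImpl(IFooBar2):
    def frob(self) -> str:
        return "frobbed"
    def frob_faster(self) -> str:
        return "frobbed, but faster"

def query_interface(obj, interface):
    # Very loose analogue of QueryInterface: ask the object at runtime whether
    # it supports a newer interface, and fall back gracefully if it doesn't.
    return obj if isinstance(obj, interface) else None

obj = FooBarImpl()
v2 = query_interface(obj, IFooBar2)
print(v2.frob_faster() if v2 else obj.frob())
```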
No caching support. Etcetera, etcetera. You’re getting the picture.
There were more problems than these, but I left the best till last: many companies don’t want to expose a general API to their service. This can be for branding, business or strategy reasons. HTTP was never designed for APIs to begin with so it gives the client nearly no power out of the box. You can invoke private application endpoints yourself but don’t be surprised if you get redirected to a CAPTCHA page — after all, HTTP never promised you’d get actual data back from a request!
MCP
How does the Model Context Protocol relate to all of this?
Well, arguably it doesn’t. MCP is the latest in a long line of efforts that add thin extensions on top of the HTTP layer cake. In a hypothetical world where OOP RPC had won you could have had an MCP on top of it too, and it might even have had some advantages. But we don’t live in that world.
Conclusion
The HTTP ecosystem provides sooooo many features that we take for granted. Could OOP RPC have ever worked?
I suspect the answer is yes, it could, but it would have looked very different to DCOM or CORBA. Many of the problems described above could have been avoided with more insightful protocol design. The core problem of stateful pointers is also not insurmountable: nothing says an object has to be loaded into RAM at all times, nor that it has to be tied to a physical IP address, nor that remote clients have to be able to create objects whenever they want.
An OOP RPC protocol that worked would have looked more like a standard way to encode references to “objects” stored in a database, with a way to reflect available methods and properties, pass references into and out of method calls, and navigate object hierarchies in a pipelined manner. This sort of tech might well still have failed, as most apps don’t actually want to give any flexibility to the clients, and the whole point of exposing general object-oriented APIs is to give flexibility. But nonetheless, we’re slowly iterating towards this kind of vision with things like GraphQL, separate data access APIs and now MCP.
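To make that hypothetical a little more concrete, here is a rough sketch of what such an encoding might look like: durable references instead of live pointers, reflection to discover what an object offers, and navigation plus invocation pipelined into one request. Every name, URL and method here is invented purely for illustration.

```python
# A durable reference: no live pointer, no pinned IP address. The "object"
# lives in a database and any replica of the service can resolve it.
invoice_ref = {"store": "https://api.example.com/objects", "id": "invoice/1234"}

# Reflection: ask the store what the referenced object can do.
describe_request = {"op": "describe", "ref": invoice_ref}
# -> {"properties": ["total", "customer"], "methods": ["mark_paid", "refund"]}

# A pipelined call: navigate the object hierarchy and invoke a method in one
# round trip, passing references (not live pointers) into and out of the call.
call_request = {
    "op": "invoke",
    "ref": invoice_ref,
    "path": ["customer"],          # navigate: invoice -> its customer object
    "method": "open_ticket",
    "args": [{"subject": "refund request"}],
}
# -> {"result": {"store": "https://api.example.com/objects", "id": "ticket/987"}}

print(call_request)
```

Durable references, reflection and pipelined navigation each exist in partial form in the technologies above; the sketch just puts them side by side.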