2,000x faster route propagation by rewriting our Traefik gateway in Rust


Rivet is an open-source, self-hostable serverless platform that supports Rivet Functions, Rivet Actors (open-source Cloudflare Durable Objects), and Rivet Containers. We've rewritten our Traefik-based gateway in Rust — now named Rivet Guard — to support the hyper-specific needs of building a serverless platform.

Rivet Guard is our gateway service that handles all incoming traffic to the Rivet platform. Its core responsibilities include:

Rivet Function Routing: Route the appropriate paths and hostnames to the correct Rivet Function
Rivet Actors & Rivet Containers Routing: Route incoming requests to the correct Rivet Actor or Rivet Container
Rivet API: Route API endpoints to the Rivet API server

Traefik is a widely used reverse proxy that makes it incredibly easy to set up load balancing & routing in a variety of environments.

To ship the original version of Rivet, we chose Traefik because of its ease of use for configuring dynamic routes. We always knew this was a temporary solution until we could write a more mature gateway ourselves.

We took advantage of the Traefik HTTP provider to dynamically configure routes for the MVP of Rivet. Our setup consisted of 3 components:

Traefik: Acts as a gateway for proxying requests from clients to the appropriate Rivet Function, Actor, Container, or API endpoint
HTTP provider: An HTTP API that Traefik polls every 500 ms to pull the available routes
- Responds with a "state of the world" that includes all of the routes, middleware, and services.
- Includes an in-process memory cache of available routes
NATS Cache Pub/Sub Topic for purging: NATS provides a topic to notify the HTTP API that the routes have updated and the in-memory cache needs to be purged

Part 1: Slow route propagation

It's incredibly important that Rivet Functions, Actors, and Containers are immediately available when provisioned. We put a lot of work into our infrastructure to make this happen, but Traefik has always been the one piece that slowed down how fast we can make resources available.

Under the hood, we had Traefik configured to poll at a 500 ms interval, with providersThrottleDuration set to 0.025s. Setting the interval any lower caused internal backpressure. Despite this configuration, Traefik did not expose routes within 500 ms in practice; we found it reliably took between 1 and 2 seconds. We therefore had to add an artificial timeout of 2 seconds just to give our routes time to propagate reliably.

All of the work we had put into making our backend & infrastructure provision resources blazing fast was going to waste with a measly timeout.

Part 2: Large HTTP provider responses

Every single route available on Traefik required a configuration entry for:

A route entry
A service entry
Custom middleware for in-flight request limits (configurable by developers)

These routes come out to a minimum 1.3 KB of JSON per route — assuming the developer has not configured additional middleware.

At around 3 MB JSON responses (just over 2,000 routes, or roughly 250 routes in each of our 8 regions), Traefik's latency started to increase noticeably.

Our load tests had previously shown that Traefik completely stops working once your HTTP provider JSON config hits ~14 MB, so we knew it was time to upgrade the Rivet gateway or bad things would happen.

When designing Rivet Guard, we wanted it to support:

Fast route propagation: Routes should be available immediately without having to wait for them to propagate
Infinite routes: Vital for the number of Rivet Actors that developers are creating
Stateless: No complex route materialization or config generation

Rivet Guard is built differently than most other proxies: it's a library into which we plug our own custom routing handlers. Its routes are lazily resolved & cached on demand instead of being loaded into memory up front.

When a request reaches Rivet Guard, it lazily calls a user-provided function called routing_fn, which returns the endpoint to connect to. This endpoint is cached for future requests.
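To make the lazy lookup concrete, here is a minimal sketch of a resolver that only calls routing_fn on a cache miss. The names Endpoint and LazyRouter are hypothetical and chosen for this example; they are not Rivet Guard's actual types, and a real gateway would also need cache expiry and concurrency handling.

```rust
use std::collections::HashMap;

/// Hypothetical shape of a resolved endpoint (not Rivet Guard's real type).
#[derive(Clone, Debug, PartialEq)]
pub struct Endpoint {
    pub host: String,
    pub port: u16,
}

/// Calls `routing_fn` only on a cache miss and remembers the result.
pub struct LazyRouter<F>
where
    F: Fn(&str, &str) -> Option<Endpoint>,
{
    routing_fn: F,
    cache: HashMap<(String, String), Endpoint>,
}

impl<F> LazyRouter<F>
where
    F: Fn(&str, &str) -> Option<Endpoint>,
{
    pub fn new(routing_fn: F) -> Self {
        Self { routing_fn, cache: HashMap::new() }
    }

    /// Returns the endpoint for a hostname and path, resolving it lazily.
    pub fn resolve(&mut self, hostname: &str, path: &str) -> Option<Endpoint> {
        let key = (hostname.to_string(), path.to_string());
        if let Some(hit) = self.cache.get(&key) {
            return Some(hit.clone());
        }
        // Cache miss: ask the user-provided function, then cache the answer.
        let endpoint = (self.routing_fn)(hostname, path)?;
        self.cache.insert(key, endpoint.clone());
        Some(endpoint)
    }
}
```

In this sketch, unknown routes are not cached, so a route that does not exist yet is looked up again on the next request.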

In contrast to the previous Traefik configuration, this provides the following benefits:

Simpler architecture: Simplifying from 3 components (Traefik, API, NATS) → 1 component (Rivet Guard)
Completely stateless: Routes are fetched & cached on demand
Scales: Infinite number of routes
Flexible: We can parse & generate routes in any way that you can with code, e.g. regex, dynamic route lookups, etc.
Easy to debug: We can use our existing monitoring stack
Configuration-free: "Configuring" Rivet Guard is just writing code without the sharp edges that configuring things like Traefik routes has

When configured with a routing_fn that can resolve routes from an external datasource, the new gateway architecture for Rivet looks like:

Configuring Rivet Guard is done by defining three functions:

routing_fn: For a given hostname and path, return the target endpoints to route to
middleware_fn: For a given endpoint, return the middleware (i.e. rate limiting, max in-flight, retries, timeout)
cert_resolver_fn: For a given hostname, return the TLS certificate to use

Configuring Rivet Guard looks like this:
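As a rough illustration of that shape, here is a minimal sketch in Rust. Only the three function names come from the description above; the type names (RouteTarget, RouteResult, MiddlewareConfig, TlsCert, GuardConfig), the example_config helper, and all values are assumptions made for this example, not Rivet Guard's actual API.

```rust
use std::time::Duration;

// All types below are hypothetical stand-ins, not Rivet Guard's real API.
#[derive(Clone, Debug, PartialEq)]
pub struct RouteTarget {
    pub host: String,
    pub port: u16,
}

#[derive(Clone, Debug, PartialEq)]
pub struct RouteResult {
    pub targets: Vec<RouteTarget>,
    pub path_prefix: String,
}

#[derive(Clone, Debug, PartialEq)]
pub struct MiddlewareConfig {
    pub requests_per_second: u32,
    pub max_in_flight: usize,
    pub max_retries: u32,
    pub timeout: Duration,
}

#[derive(Clone, Debug, PartialEq)]
pub struct TlsCert {
    pub cert_pem: String,
    pub key_pem: String,
}

/// The whole "configuration" is three plain functions.
pub struct GuardConfig {
    pub routing_fn: Box<dyn Fn(&str, &str) -> Option<RouteResult> + Send + Sync>,
    pub middleware_fn: Box<dyn Fn(&RouteTarget) -> MiddlewareConfig + Send + Sync>,
    pub cert_resolver_fn: Box<dyn Fn(&str) -> Option<TlsCert> + Send + Sync>,
}

pub fn example_config() -> GuardConfig {
    GuardConfig {
        // Hostname + path -> targets. A real lookup would hit a datastore.
        routing_fn: Box::new(|hostname: &str, path: &str| {
            if hostname == "api.example.com" && path.starts_with("/v1") {
                Some(RouteResult {
                    targets: vec![RouteTarget { host: "10.0.0.5".to_string(), port: 8080 }],
                    path_prefix: "/v1".to_string(),
                })
            } else {
                None
            }
        }),
        // Endpoint -> rate limit, max in-flight, retries, timeout.
        middleware_fn: Box::new(|_target: &RouteTarget| MiddlewareConfig {
            requests_per_second: 100,
            max_in_flight: 64,
            max_retries: 3,
            timeout: Duration::from_secs(30),
        }),
        // Hostname -> TLS certificate (placeholder PEM strings).
        cert_resolver_fn: Box::new(|hostname: &str| {
            hostname.ends_with(".example.com").then(|| TlsCert {
                cert_pem: "<certificate PEM>".to_string(),
                key_pem: "<private key PEM>".to_string(),
            })
        }),
    }
}
```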

Rivet Guard is responsible for handling everything from the TCP connection to the actual proxying. Accepting a request handles the following steps:

1. Accept TCP connection: Incoming TCP connection accepted on configured port
2. TLS Handshake: TLS handshake using the cert from cert_resolver_fn
3. HTTP Request Parsing: Request parsed by hyper
4. Route Resolution: Check route cache or call routing_fn
5. Middleware Resolution: Check cache or call middleware_fn
6. Rate & in-flight limiting: Apply rate limits using middleware config
7. Request Proxying: Transform and forward request to the target service with retry logic from middleware config (more on this later)
   - Requires extra logic for forwarding WebSockets powered by tokio-tungstenite
8. Response Handling: Receive and forward response from target service to the client

Most of this is pretty standard across reverse proxies, except for steps 4 and 7. We'll dive into how these work in practice.

Now for the good part: how does rewriting this critical piece of software make it so much faster?

Instead of waiting 2 seconds for Traefik to poll our configuration, we can immediately look up where to route a request with routing_fn — even if the route isn't ready to serve requests yet (more on that later).

Under the hood, almost everything in Rivet is powered by FoundationDB. Route lookups are a single-key get operation, which takes between 0.1 and 1 ms. That means our worst-case routing latency is 1 millisecond, which is 2,000x faster than our previous 2-second latency.

After the first request to a route, all following requests are cached, meaning that the 1 millisecond overhead is a one-off scenario & future requests take a negligible amount of time to resolve. More on this later.

Most infrastructure like Kubernetes takes a long time to start containers because you have to wait for many steps (download image, start container, wait for server ready, wait for health check).

This means that whenever you need to start a resource, you usually need to wait a long time for it to come up before you have a URL you can access it from.

On the other hand, Rivet returns a URL to access a resource before the resource is ready. This way, the resource has time to start up while the connection URL is being sent back to the client over WAN (which is bottlenecked by the speed of light) and while the caller's backend processes the request.

This means that instead of waiting for:

Container Request
Container Starting
Route Propagating
Your Backend
Send URL To Client
Client Connects To URL

Rivet Guard can do:

Container Request
In Parallel:
- Rivet Infrastructure
  1. Container Starting
- Your Infrastructure
  1. Your Backend
  2. Send URL To Client
  3. Client Connects To URL
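The difference between the two sequences can be sketched as simple arithmetic: the first adds every step together, while the second takes the longer of the two parallel branches and has no propagation wait. The struct and function names below are hypothetical, and this is a simplified model rather than a measurement.

```rust
use std::cmp::max;
use std::time::Duration;

/// Illustrative per-step durations; the field names are assumptions.
pub struct StartupTimings {
    pub container_start: Duration,
    pub route_propagation: Duration,
    pub backend_processing: Duration,
    pub send_url_to_client: Duration,
    pub client_connect: Duration,
}

/// Every step waits for the previous one to finish.
pub fn sequential_latency(t: &StartupTimings) -> Duration {
    t.container_start
        + t.route_propagation
        + t.backend_processing
        + t.send_url_to_client
        + t.client_connect
}

/// The container starts while the caller's side does its own work,
/// and there is no route propagation step to wait for.
pub fn parallel_latency(t: &StartupTimings) -> Duration {
    let caller_side = t.backend_processing + t.send_url_to_client + t.client_connect;
    max(t.container_start, caller_side)
}
```

With made-up inputs of 800 ms for container start, the 2 second propagation wait, 50 ms of backend processing, and 100 ms each for sending the URL and connecting, the sequential path comes to 3,050 ms and the parallel path to 800 ms.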

What if I don't send the URL to the client over WAN?

This sequence assumes that a container URL has to be sent back to the client over WAN, which adds a full round trip. This is a common pattern for container workloads like game servers, desktop automation, and code sandboxes. Even if you don't need to share the URL to a container publicly, this still significantly cuts latency given that your backend needs time to process the response in parallel with the container starting.

However, if the resource is not ready by the time the client makes the request, normally a reverse proxy would fail with something like a Bad Gateway error.

Instead, Rivet Guard intelligently buffers requests until the resource is ready, then delivers them.

This has the added benefit of gracefully handling resource crashes & restarts. Say a resource becomes unresponsive or restarts: this mechanism can intelligently buffer and send requests once the service accepts the underlying TCP connection. This is incredibly important for Rivet Actors and Rivet Containers where — unlike traditional edge functions — there are frequently singletons (only one instance of a single resource) which need to be able to be gracefully restarted.
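One way to picture this is a loop that holds on to the request and keeps retrying the upstream connection until it is accepted or a deadline passes. The sketch below is an assumption about how such a loop could look, not Rivet Guard's implementation: retry_until_ready is a hypothetical helper, it is synchronous for brevity, and it takes the connection attempt as a closure so the idea stays independent of any particular networking stack.

```rust
use std::io;
use std::thread;
use std::time::{Duration, Instant};

/// Retries `attempt` while the target refuses connections, up to `max_wait`.
/// Any error other than "connection refused" is returned immediately.
pub fn retry_until_ready<T>(
    mut attempt: impl FnMut() -> io::Result<T>,
    max_wait: Duration,
    retry_interval: Duration,
) -> io::Result<T> {
    let deadline = Instant::now() + max_wait;
    loop {
        match attempt() {
            Ok(conn) => return Ok(conn),
            Err(err) if err.kind() == io::ErrorKind::ConnectionRefused => {
                // The resource is not accepting connections yet (still
                // starting, or restarting after a crash): wait and retry.
                if Instant::now() >= deadline {
                    return Err(err);
                }
                thread::sleep(retry_interval);
            }
            Err(err) => return Err(err),
        }
    }
}
```

For a real TCP target, the closure could be something like `|| std::net::TcpStream::connect("1.2.3.4:5678")`, with the buffered request sent once the call returns a connected stream.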

This pattern is well proven in systems like FoundationDB where instead of doing rolling deploys for upgrades, you simply restart the database with the new version. The clients are able to intelligently replay requests so there is no impact to the client. Read more about FoundationDB's upgrade process to see prior art to this pattern.

The key part of this architecture is that the routing function is called lazily and cached for speedy responses.

The caching is done by returning the host & path prefix from the routing function. For example, if the user requests example.com/foo/bar, but we know that the route prefix is /foo, we'll cache that route for all future requests to example.com/foo/* (including example.com/foo/baz).

The value that gets cached is the targets & the middleware (in separate caches with different timeouts).
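A prefix-keyed cache with a timeout could be sketched as follows. PrefixRouteCache and CachedRoute are hypothetical names, and the longest-prefix rule and segment-boundary matching are assumptions made for this example. The separate target and middleware caches described above could be modeled as two instances with different TTLs.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical cached value: the resolved target addresses.
#[derive(Clone, Debug, PartialEq)]
pub struct CachedRoute {
    pub targets: Vec<String>,
}

/// Caches routes per hostname and path prefix, with a time-to-live.
pub struct PrefixRouteCache {
    ttl: Duration,
    // hostname -> (path prefix, route, time of insertion)
    entries: HashMap<String, Vec<(String, CachedRoute, Instant)>>,
}

impl PrefixRouteCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    pub fn insert(&mut self, hostname: &str, path_prefix: &str, route: CachedRoute) {
        let routes = self.entries.entry(hostname.to_string()).or_default();
        // Replace any existing entry for the same prefix.
        routes.retain(|(prefix, _, _)| prefix.as_str() != path_prefix);
        routes.push((path_prefix.to_string(), route, Instant::now()));
    }

    /// Returns the unexpired route with the longest matching prefix, if any.
    pub fn get(&self, hostname: &str, path: &str) -> Option<&CachedRoute> {
        let now = Instant::now();
        self.entries
            .get(hostname)?
            .iter()
            .filter(|(prefix, _, inserted_at)| {
                now.duration_since(*inserted_at) < self.ttl && matches_prefix(path, prefix)
            })
            .max_by_key(|(prefix, _, _)| prefix.len())
            .map(|(_, route, _)| route)
    }
}

/// "/foo" matches "/foo" and "/foo/bar", but not "/foobar".
fn matches_prefix(path: &str, prefix: &str) -> bool {
    match path.strip_prefix(prefix) {
        Some(rest) => rest.is_empty() || rest.starts_with('/') || prefix.ends_with('/'),
        None => false,
    }
}
```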

If we cache routes and the target changes (e.g. resource upgrades or migrates to a different machine), how do we know when we need to update the target list?

We could implement a cache invalidation system by sending a message to all Rivet Guard instances (similar to what we did for Traefik with NATS), but this requires adding another moving part to a system that we're trying to keep as simple as possible. (We'll still implement this later to minimize p99 latency when restarting resources & to pick up new targets faster.)

Instead, we can implement a much simpler alternative: if we can't route to a target, invalidate the cache and try again:

Make request to target 1.2.3.4:5678
1.2.3.4:5678 returns connection refused
Invalidate cache & call routing function again
Proceed as usual
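The four steps above can be sketched as a small function. This is a simplified illustration under assumptions: connect_with_invalidation is a hypothetical name, the cache is a plain map from a route key to a single target address, and the routing and connection steps are passed in as closures.

```rust
use std::collections::HashMap;
use std::io;

/// Tries the cached target first. On "connection refused", drops the cache
/// entry, calls the routing function again, and retries exactly once.
pub fn connect_with_invalidation<T>(
    cache: &mut HashMap<String, String>,
    route_key: &str,
    mut routing_fn: impl FnMut(&str) -> Option<String>,
    mut connect: impl FnMut(&str) -> io::Result<T>,
) -> io::Result<T> {
    for attempt in 0..2 {
        let cached = cache.get(route_key).cloned();
        let target = match cached {
            Some(target) => target,
            None => {
                // Cache miss (or just invalidated): resolve and cache again.
                let fresh = routing_fn(route_key)
                    .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no route"))?;
                cache.insert(route_key.to_string(), fresh.clone());
                fresh
            }
        };
        match connect(&target) {
            Err(err) if attempt == 0 && err.kind() == io::ErrorKind::ConnectionRefused => {
                // The cached target is stale: invalidate and loop once more.
                cache.remove(route_key);
            }
            other => return other,
        }
    }
    unreachable!("the second attempt always returns")
}
```

Limiting the loop to one re-resolution keeps a genuinely unavailable target from triggering endless routing lookups; how many retries to allow is a design choice this sketch does not take from the original.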

A keen eye may notice: what if Container B gets scheduled on 1.2.3.4:5678 before our cache can be invalidated by a connection refused from Container A? Wouldn't that incorrectly route requests meant for Container A to Container B? Yep. That's why we reserve the port after it's been released for a duration longer than our cache expiration, so this never happens.
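A port reservation of this kind could look roughly like the allocator below. PortAllocator is a hypothetical type written for this example, not Rivet's scheduler; the only property it is meant to show is that a released port stays unavailable until a cooldown, chosen to be longer than the gateway's cache expiration, has elapsed. The current time is passed in explicitly to keep the logic easy to test.

```rust
use std::collections::{HashMap, HashSet};
use std::ops::RangeInclusive;
use std::time::{Duration, Instant};

/// Hands out ports, keeping released ports reserved for a cooldown period.
pub struct PortAllocator {
    range: RangeInclusive<u16>,
    cooldown: Duration, // should exceed the gateway's route cache expiration
    in_use: HashSet<u16>,
    released_at: HashMap<u16, Instant>,
}

impl PortAllocator {
    pub fn new(range: RangeInclusive<u16>, cooldown: Duration) -> Self {
        Self {
            range,
            cooldown,
            in_use: HashSet::new(),
            released_at: HashMap::new(),
        }
    }

    /// Returns the first port that is neither in use nor cooling down.
    pub fn allocate(&mut self, now: Instant) -> Option<u16> {
        let port = self.range.clone().find(|port| {
            let cooling_down = self
                .released_at
                .get(port)
                .map_or(false, |released| now.duration_since(*released) < self.cooldown);
            !self.in_use.contains(port) && !cooling_down
        })?;
        self.released_at.remove(&port);
        self.in_use.insert(port);
        Some(port)
    }

    /// Marks a port as released and starts its cooldown.
    pub fn release(&mut self, port: u16, now: Instant) {
        if self.in_use.remove(&port) {
            self.released_at.insert(port, now);
        }
    }
}
```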

Why Rust, and Not Go?

Now for the fun part — provoking the age-old Crustaceans-Gophers Wars.

I want to be clear that this is not your usual "Rewrite It In Rust" post — I firmly believe Go would have been a great choice for this as well.

Either would have been well suited for the job, but these are some factors that made us comfortable choosing Rust instead of Go, despite Go's extensive history with Traefik, Istio, and other networking infrastructure tools:

Extensive internal ecosystem at Rivet: Everything we have is written in Rust, so we could use all of the monitoring & database tooling we've already battle tested
Building around Hyper & Tungstenite: Hyper is powering many production systems. As a startup, we don't have time to futz with building "correct" lower-level networking implementations. We also want to support QUIC & upcoming transports, which Hyper provides for free.
Memory safety: The surface area of bugs we need to worry about with Rust is much smaller than Go — which I can't overstate the importance of for something as critical as our gateway that touches every request that reaches Rivet
No GC pauses: Go advertises sub-millisecond GC pauses, but I'm not enough of an expert in Go to understand the tradeoffs here. Real-world experience always differs from what's advertised, and Rust has fewer moving parts, which makes it much easier to reason about where any potential performance issues might be coming from.

Our use case is not totally unique. Most large companies need complicated dynamic route discovery, so there are mature tools in the ecosystem for this.

Envoy is used heavily in services like Consul for dynamic configuration. Envoy works by using a "discovery service" (xDS) as described in their dynamic configuration documentation.

We could've exposed a customized Endpoint Discovery Service (EDS) and Route Discovery Service (RDS) over gRPC or REST to support this.

Why we went with our own custom solution:

Our logic for optimistic retries, caching, and cache invalidation is incredibly specialized to our architecture
We can easily add more features deeply integrated with the gateway, such as analytics, etc
Gateways are always on the bleeding edge of networking standards. If we want to support WebTransport or other new technologies, we don't want to have to fork Envoy
We have use cases that include TCP & UDP, which are not covered in this piece. This is not Envoy's strong suit; we'd likely have to also incorporate Quilkin to make UDP work
Fewer moving parts: this is a single binary with fewer points of failure (no need for a gRPC or REST server)
We care a lot about portability & being able to run all of Rivet in a single container for a full local dev environment (single container docs)
Faster & easier to write it from scratch than to parse Envoy's obscure documentation; in our past experience, configuring Envoy required reading the source to understand how to do it

rivet-guard-core has a lot of similarities to Cloudflare's Pingora project which supports programmable network services.

Pingora likely would've met our use case with flying colors. It came down to: the MVP for what we need is not that large. rivet-guard-core is only 1,777 physical lines of code (excluding tests) and rivet-guard (which implements routing logic) is only 861 physical lines of code.

Similar to our reason for not using Envoy, we have a lot of edge cases in what our gateway needs to do, so it was a safer bet in time investment (both for MVP and our future projects) to build it from scratch rather than run into a wall with an external library and have to fork + try to get merged.

Despite Rivet Guard's source being relatively small, I don't want to understate the difficulty of writing a reverse proxy. Reverse proxies are easy to implement, but even easier to implement incorrectly. Most of the time developing Rivet Guard was spent writing test cases to handle edge cases around TLS, WebSockets, retries, caching, common DoS attacks, etc.

Now that we have a solid foundation with Rivet Guard, there are many features we're looking forward to supporting.

These are features that we hope to implement in the future that influenced our design:

Rivet Container & Rivet Actor wake on network request: We can dynamically wake containers when a network request comes to them to make workloads cheaper
WebSocket & SSE hibernation for Rivet Actors & Rivet Containers: This is a neat feature that Cloudflare has on Durable Objects. Having a fully custom gateway allows us to keep WebSocket connections open on the gateway while letting our actors sleep and re-wake on a WebSocket message
Request-based autoscaling for Rivet Containers & Rivet Functions: CPU/memory metric-based autoscaling (think HPA/VPA) is notoriously slow to react to changes in request volume and can lead to outages if traffic spikes too fast or minimums are not set correctly. You can autoscale much more efficiently and avoid dropping requests with request-based autoscaling that factors the load balancer into the equation. This allows for fewer wasted resources and no outages on spiky traffic.
Analytics on traffic going to your application
Anycast: Roll out our own anycast network with Rivet Guard as the entrypoint

QUIC, HTTP/3, and WebTransport

QUIC: We don't support QUIC yet, mainly because we haven't taken the time to figure out what sorts of DoS attacks we need to handle with an entirely new transport, but it will be coming soon.
WebTransport: WebTransport is coming... eventually (looking at you Apple). It will serve as a modern replacement to WebSockets, with support for congestion control, unreliable transport, and much more. This is going to make proxies much trickier, since it includes unreliable transport & relies on QUIC. This is something we'd much rather be able to handle ourselves, since this is fairly bleeding edge and we've found that many load balancers lag behind in proper support for new transports.

Building Rivet Guard has unlocked massive performance benefits, given us flexibility for future features, and removed the biggest bottleneck to scaling Rivet.

Key technical improvements include: