In my continuing quest to have my homelab act as an actual experimentation laboratory, I decided to turn my single-instance PostgreSQL server into a cluster with failover. I have a number of machines in my homelab infrastructure at this point that can run compute workloads, with varying degrees of reliability (most of the unreliability comes from using old SSDs as the boot disks, which is something I'm replacing), and it made sense to leverage them so I can take any one out of service to work on it / do science to it.
The goal of this was pretty simple - stop one of my machines going offline from taking down services running on other machines. Much in the same way I ditched HAProxy in favour of virtual IPs, I wanted to be able to take a machine out of rotation entirely (whether due to a crash or hardware maintenance) and still be able to use my self-hosted things. I go out of my way to test this by randomly disrupting machines, virtual or physical, to make sure I don't have any blind spots.
The first thing to decide was how. There are a few options out there, but I ended up opting for Patroni, which isn't a completely self-contained solution but rather a component for building towards the end goal. Specifically, it handles the configuration of a local PostgreSQL instance and the coordination with other Patroni instances through an etcd (or other) cluster.
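For reference, a single node's Patroni config looks roughly like this - a minimal sketch rather than my exact setup, with the cluster name, node names, IPs and credentials all being placeholders:

```yaml
# patroni.yml - minimal sketch for one node; names, IPs and passwords are placeholders
scope: homelab-pg            # cluster name, shared by every member
name: pg-node-1              # unique per node

restapi:
  listen: 0.0.0.0:8008       # HTTP API used for health checks later on
  connect_address: 192.168.1.21:8008

etcd3:
  hosts: 192.168.1.11:2379,192.168.1.12:2379,192.168.1.13:2379

bootstrap:
  dcs:
    ttl: 30                  # leader key TTL; failover happens when this expires
    loop_wait: 10
    retry_timeout: 10

postgresql:
  listen: 0.0.0.0:5432
  connect_address: 192.168.1.21:5432
  data_dir: /var/lib/postgresql/16/main
  authentication:
    replication:
      username: replicator
      password: change-me
    superuser:
      username: postgres
      password: change-me
```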
Patroni coordinates with other instances using etcd (or another backing store, but I opted for etcd), which is a distributed, strongly consistent KV store - it's best known as the database used by Kubernetes, and... well, not much else. It can be run standalone, but when it comes to KV stores most people reach for Redis/Redict/Valkey or similar. Those stores are great, but they trade away consistency and guarantees about data being available. In other words, if you write a key to them and then read that key later, it's possible the key won't exist despite it never being deleted or expired (especially in a clustered setup). That makes those options fast, but ill-suited for situations where you need to be sure that data written by one instance can be read from another instance in the cluster.
All that is to say, along with a PostgreSQL cluster, I also have a fully functional etcd cluster to go with it. The cluster is made up of three virtual machines on different hardware within my lab, communicating over static LAN IPs. Each one has about 512MiB of memory and 2 vCPUs, as the resource demand is quite low.
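Each member gets a small config file along these lines - again a sketch with placeholder names and IPs rather than my literal configuration:

```yaml
# /etc/etcd/etcd.conf.yml - sketch for one of the three members; names and IPs are placeholders
name: etcd-1
data-dir: /var/lib/etcd
listen-peer-urls: http://192.168.1.11:2380
listen-client-urls: http://192.168.1.11:2379,http://127.0.0.1:2379
initial-advertise-peer-urls: http://192.168.1.11:2380
advertise-client-urls: http://192.168.1.11:2379
# all three members, so the cluster can bootstrap and reach quorum
initial-cluster: etcd-1=http://192.168.1.11:2380,etcd-2=http://192.168.1.12:2380,etcd-3=http://192.168.1.13:2380
initial-cluster-state: new
initial-cluster-token: homelab-etcd
```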
Connecting to the cluster is done through... HAProxy. Although this time, HAProxy lives in my Kubernetes cluster, with two replicas! Patroni exposes an HTTP server that responds with some basic stats, and also indicates whether an instance is a secondary (read only) or the primary (read and write) by the status code it returns (503 if secondary, 200 if primary). This lets HAProxy work out which instance is the primary accepting reads and writes, and forward all connections to that instance. While I would love to be able to route read-only queries to the secondaries, that's less important than being able to fail over between them, so it's taking a back seat for a little while (it's also more important for high-performance environments, and less so for mine). The Kubernetes deployment has an associated LoadBalancer service, so it's given an IP on my LAN that I can connect to directly if needed.
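The relevant bit of the HAProxy config looks roughly like this - a sketch of the approach described above rather than my exact config, with placeholder server names and IPs, health-checking Patroni's REST API (port 8008 by default) and only routing traffic to the node that answers 200:

```
# Sketch: forward PostgreSQL traffic to whichever Patroni node is currently primary
listen postgres-primary
    bind *:5432
    mode tcp
    option httpchk GET /primary          # Patroni returns 200 only on the primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pg-node-1 192.168.1.21:5432 check port 8008
    server pg-node-2 192.168.1.22:5432 check port 8008
    server pg-node-3 192.168.1.23:5432 check port 8008
```

The `on-marked-down shutdown-sessions` bit is there so existing connections to a demoted node get cut rather than lingering on a now read-only instance.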
To test this, the natural thing was to suddenly power off the primary instance and see what broke. This would also let me see what configuration I needed to update to ensure the new in-cluster HAProxy was used for connections, rather than the IP of the original instance. Generally I've found the best way to test systems like this is to literally stop a machine and see what goes down - if you're scared to do that to a system that is supposed to be highly available, then those concerns should be addressed rather than ignored and the test never performed.
The inevitable question is "why not run PostgreSQL in Kubernetes?", and my answer is one that I've held for a long while: Kubernetes is not suitable for stateful workloads. I've softened my opinion on this a bit, but my stance is that if you can't throw away the nodes in a Kubernetes cluster (e.g. delete them) and recreate them with no issues, then you may be holding it wrong. Things like only being able to run a single instance of something, or relying on local storage (e.g. hostPath), are kubesmells. Now granted, there are some exceptions to that opinion, and with any application that properly supports clustering you'll probably be fine running it in Kubernetes, stateful or not. I'll continue to avoid it for the time being, but if it makes sense for your use case (along with your usage of Kubernetes) then by all means! It's also entirely possible I'll move my PostgreSQL database into Kubernetes eventually, although my few attempts at similar efforts have not gone particularly well.
Also, it's possible the actual answer is "because I want to be more flexible with my PostgreSQL configuration and labbing", so don't read too much into it, I guess.
I originally wrote this post as a sort of step-by-step for setting this up, but things like the existing Patroni docs do a fine job of that. If you're interested in setting something like this up in your own lab, I really encourage it! Highly available PostgreSQL is a very helpful thing to have, and the concepts around it transfer easily to other HA-related topics (e.g. quorum, consensus, etc).
My next target is making my Vault instance highly available - and that will probably just be moved into Kubernetes, especially since it already uses PostgreSQL to store its data, which Vault supports as an HA backend. I'm getting slightly hung up on auto-unseal stuff, since I want to keep everything in-house, but watch this space!
Feedback and comments are always welcome over on the fediverse. You can also help support my addiction adventures in homelabbing through ko-fi.
