Generalized Consensus: An Alternate Approach to Distributed Durability

3 hours ago 1

Today, we're releasing a series that presents a fresh perspective on consensus algorithms. Rather than treating consensus as a monolithic black box, we propose a conceptual framework that makes these systems more approachable, adaptable, and flexible.

Why Another Take on Consensus?

Consensus algorithms like Paxos and Raft have been foundational to distributed systems for decades. Paxos, while powerful, has been notoriously difficult to understand. Raft improved accessibility but remains a monolithic algorithm that's risky to modify. This has effectively limited our flexibility in adapting consensus systems to modern cloud architectures.

The problem is twofold:

The problem itself is not well-defined - most explanations focus on what consensus does rather than what problem it solves
Research has focused on proving specific algorithms rather than building conceptual frameworks

This series takes a different approach.

What We Cover

A New Definition

We start by redefining consensus around the actual problem it solves:

Consensus solves the problem of Distributed Durability and High Availability.

In simpler terms: A consensus system must ensure that every request is saved elsewhere before it is acknowledged. If there is a failure, the system must have the ability to find the saved requests, complete them, and resume operations.

This definition shifts the focus from the algorithm to the goal, making it easier to reason about different approaches and implementations.

Breaking Free from Majority Quorum

Today's cloud environments have complex topologies with nodes, racks, availability zones, and regions. Yet we're stuck with rigid majority quorum requirements that don't align with these realities.

What if you want durability that requires:

At least one replica in a different availability zone?
Two replicas across any two distinct regions?
A specific combination based on your network topology and cost structure?

The series demonstrates how to accommodate arbitrarily complex durability policies without changing the core algorithm. We introduce the concept of pluggable durability policies that can be specified externally, like a plugin.

Goal-Oriented Rules

Instead of prescribing a specific algorithm, we establish a set of governing rules that any consensus implementation must satisfy:

Durability: Every decision must be made durable according to the policy
Consistency: Decisions must be applied sequentially, with proper revocation and discovery mechanisms

These rules avoid dictating specific implementations, allowing for diverse approaches while maintaining safety guarantees.

Practical Applications

The series isn't purely theoretical. We demonstrate:

How existing algorithms (Paxos, Raft) are special cases of this generalization
A concrete implementation approach inspired by Raft that supports flexible durability policies
How to separate concerns like failure detection into independent components (coordinators)
Practical considerations for building real systems, including handling ruleset changes

The Complete Series

The series consists of 11 parts:

Defining the Problem - Reframing consensus around distributed durability
Building the Foundation - Log replication, durability requirements, and pluggable policies
Governing Rules - The core rules every consensus system must follow
Fulfilling Requests - How leaders process and durably commit requests
Before and After - Finding a way to order events consistently
Revocation and Candidacy - Satisfying the prerequisites for a leader change
Discovery and Propagation - Finding and honoring previously committed decisions
Changing the Rules - Changing the ruleset dynamically
Consistent Reads - How to serve consistent reads
Addenda - Topics we deferred, like health checks, term numbers, etc.
Recap - Bringing it all together with a complete Raft-inspired implementation

Why This Matters for Multigres

This work directly supports our goals for Multigres. Postgres currently lacks a native consensus protocol. Existing solutions appear to use Raft as a black box, which limits their ability to optimize for Postgres's WAL replication model.

By building on this generalized framework, Multigres will support:

Flexible durability policies (cross-AZ, cross-region, custom combinations)
Better integration with Postgres's replication mechanisms
The ability to scale to larger cohort sizes without performance degradation
Native two-phase sync replication as a foundation for consensus

Start Reading

The series is designed to be read sequentially, with each part building on previous concepts. Familiarity with Raft or Paxos will be beneficial, though not required.

Start with Part 1: Defining the Problem

We believe this framework opens new possibilities for consensus systems that can adapt to modern cloud architectures while maintaining the safety guarantees we depend on.

If you have comments or questions, please start a discussion on the Multigres GitHub repository.

Read Entire Article

Generalized Consensus: An Alternate Approach to Distributed Durability

Why Another Take on Consensus?