Nicolae Vartolomei · 2025/10
OSWALD is a Write-Ahead Log (WAL) design built exclusively on object storage primitives. It works with any object storage service that provides read-after-write consistency and compare-and-swap operations, including AWS S3, Google Cloud Storage, and Azure Blob Storage.
The design supports checkpointing and garbage collection, making it suitable for State Machine Replication (SMR).
The design has been formally specified and verified using the P programming language.
Supporting code is available at github.com/nvartolomei/oswald.
High-level overview
OSWALD works with three types of objects:
- Manifest - tracks the latest checkpoint (snapshot) and garbage collection progress using Log Sequence Numbers (LSNs)
- Snapshots - user-defined state snapshots for optimized recovery
- Chunks - log content
The manifest object version, along with its content, is used for synchronizing readers, writers, and the garbage collection process. It is the only mutable object in the system.
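As a rough illustration of the three object types, the sketch below models a manifest and a key layout. The field names (`snapshot_lsn`, `gc_lsn`), the JSON encoding, and the key scheme are all assumptions made for this example, not the layout used by the OSWALD repository.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Manifest:
    # Hypothetical fields: the source only says the manifest tracks the
    # latest snapshot and garbage collection progress as LSNs.
    snapshot_lsn: Optional[int]  # LSN of the latest snapshot, None if none exists
    gc_lsn: int                  # GC watermark


def snapshot_key(lsn: int) -> str:
    # Zero-padding keeps lexicographic key order equal to LSN order.
    return f"snapshots/{lsn:020d}"


def chunk_key(lsn: int) -> str:
    return f"chunks/{lsn:020d}"


def encode(manifest: Manifest) -> bytes:
    return json.dumps(asdict(manifest)).encode()


def decode(raw: bytes) -> Manifest:
    return Manifest(**json.loads(raw))
```

Snapshots and chunks are written once and never modified, so only the manifest needs a version (for example, an ETag) for conditional requests.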
Appending
Appending to the WAL requires two round trips: a PUT-If-None-Match to create the next chunk, followed by a GET-If-None-Match on the manifest.
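The two round trips can be sketched against a toy in-memory store. `InMemoryStore`, the `chunks/<lsn>` key scheme, the `gc_lsn` manifest field, and the return values are hypothetical names chosen for this example. The sketch also assumes that a watermark above an LSN means the chunk at that LSN has been collected.

```python
import json


class InMemoryStore:
    """Toy stand-in for an object store; ETags come from a per-store counter."""

    def __init__(self):
        self._objects = {}  # key -> (etag, body)
        self._version = 0

    def put(self, key, body):
        self._version += 1
        self._objects[key] = (str(self._version), body)
        return self._objects[key][0]

    def put_if_none_match(self, key, body):
        """Create the object only if the key is absent; False means it exists."""
        if key in self._objects:
            return False
        self.put(key, body)
        return True

    def get_if_none_match(self, key, etag):
        """Return None when the ETag still matches (HTTP 304), else (etag, body)."""
        current = self._objects[key]
        return None if current[0] == etag else current


def append(store, lsn, payload, manifest_etag, gc_lsn):
    """Try to append `payload` at `lsn`; returns 'ok', 'conflict', or 'restart'."""
    # Round trip 1: create the chunk only if no chunk exists for this LSN.
    if not store.put_if_none_match(f"chunks/{lsn}", payload):
        return "conflict"  # another writer already owns this LSN
    # Round trip 2: conditional GET on the manifest.
    changed = store.get_if_none_match("manifest", manifest_etag)
    if changed is not None:
        gc_lsn = json.loads(changed[1])["gc_lsn"]
    # Assumption: a watermark above `lsn` means chunk `lsn` was already collected.
    return "restart" if gc_lsn > lsn else "ok"
```

In the common case the manifest is unchanged, so the second round trip returns no body and the append can be acknowledged immediately.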
Tailing
Tailing requires one round trip per new chunk, plus two additional round trips: a GET for the next expected chunk (404 Not Found when no more chunks exist) and a GET-If-None-Match on the manifest.
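A minimal sketch of one tailing round follows, again over a toy in-memory store. The store class, key scheme, and `gc_lsn` field are assumptions for illustration. The restart rule used here (reinitialize if the watermark has moved past the LSN this round started from) is a conservative reading of the conflict rules, not something the text specifies exactly.

```python
import json


class InMemoryStore:
    """Toy object store; `get` returns None to model 404 Not Found."""

    def __init__(self):
        self._objects = {}  # key -> (etag, body)
        self._version = 0

    def put(self, key, body):
        self._version += 1
        self._objects[key] = (str(self._version), body)
        return self._objects[key][0]

    def get(self, key):
        return self._objects.get(key)

    def get_if_none_match(self, key, etag):
        """Return None when the ETag still matches (HTTP 304), else (etag, body)."""
        current = self._objects[key]
        return None if current[0] == etag else current


def tail(store, next_lsn, manifest_etag, gc_lsn):
    """Return (new_chunks, next_lsn, must_reinitialize) for one tailing round."""
    start_lsn = next_lsn
    chunks = []
    # One GET per new chunk; the final GET is the 404 that ends the loop.
    while (obj := store.get(f"chunks/{next_lsn}")) is not None:
        chunks.append(obj[1])
        next_lsn += 1
    # Final round trip: conditional GET on the manifest.
    changed = store.get_if_none_match("manifest", manifest_etag)
    if changed is not None:
        gc_lsn = json.loads(changed[1])["gc_lsn"]
    if gc_lsn > start_lsn:
        # The tailer fell behind the watermark; discard this round's reads.
        return [], start_lsn, True
    return chunks, next_lsn, False
```

A caller would apply the returned chunks to its state machine and pass the returned `next_lsn` into the next round.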
Initialization
Readers and writers initialize with the following sequence of requests:
- GET the manifest
- GET the snapshot (if it exists)
- GET each chunk not covered by the snapshot
- GET-If-None-Match the manifest
With local snapshots, this can be optimized to a tailing-like sequence.
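The four initialization steps might look like the sketch below. It assumes a manifest with hypothetical `snapshot_lsn` and `gc_lsn` fields, where `snapshot_lsn` is the first LSN not covered by the snapshot, and it assumes that initialization is simply retried when the final manifest check shows the watermark has moved past the starting LSN. The local-snapshot optimization is not shown.

```python
import json


class InMemoryStore:
    """Toy object store; `get` returns None to model 404 Not Found."""

    def __init__(self):
        self._objects = {}  # key -> (etag, body)
        self._version = 0

    def put(self, key, body):
        self._version += 1
        self._objects[key] = (str(self._version), body)
        return self._objects[key][0]

    def get(self, key):
        return self._objects.get(key)

    def get_if_none_match(self, key, etag):
        """Return None when the ETag still matches (HTTP 304), else (etag, body)."""
        current = self._objects[key]
        return None if current[0] == etag else current


def initialize(store):
    """Return (snapshot_body, chunks, next_lsn) after a consistent read."""
    while True:
        # Step 1: GET the manifest.
        etag, raw = store.get("manifest")
        manifest = json.loads(raw)
        # Step 2: GET the snapshot, if one exists.
        snapshot, next_lsn = None, 0
        if manifest["snapshot_lsn"] is not None:
            next_lsn = manifest["snapshot_lsn"]
            snapshot = store.get(f"snapshots/{next_lsn}")[1]
        start_lsn = next_lsn
        # Step 3: GET each chunk not covered by the snapshot, until a 404.
        chunks = []
        while (obj := store.get(f"chunks/{next_lsn}")) is not None:
            chunks.append(obj[1])
            next_lsn += 1
        # Step 4: conditional GET on the manifest to detect concurrent GC.
        changed = store.get_if_none_match("manifest", etag)
        if changed is None or json.loads(changed[1])["gc_lsn"] <= start_lsn:
            return snapshot, chunks, next_lsn
        # Otherwise the watermark moved past our starting point: start over.
```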
Concurrency conflicts
Writer-Writer conflicts
When multiple writers attempt concurrent writes, conflicts are detected using PUT-If-None-Match when creating chunks.
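When a chunk for the same LSN already exists, object storage rejects the PUT with an error status (AWS S3, for example, returns 412 Precondition Failed, or 409 Conflict if a conflicting request is still in flight). The writer must then follow the Tailing protocol for catch-up recovery and retry.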
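The catch-up-and-retry loop can be sketched as follows. The `Writer` class, the toy store, and the key scheme are hypothetical. The manifest check needed when garbage collection is active is deliberately left out here to isolate the writer-writer case, and a rejected conditional PUT is modeled as a `False` return rather than an HTTP status.

```python
class InMemoryStore:
    """Toy object store with create-if-absent semantics."""

    def __init__(self):
        self._objects = {}  # key -> body

    def put_if_none_match(self, key, body):
        """Create the object only if the key is absent; False means it exists."""
        if key in self._objects:
            return False
        self._objects[key] = body
        return True

    def get(self, key):
        """Return the body, or None to model 404 Not Found."""
        return self._objects.get(key)


class Writer:
    def __init__(self, store):
        self.store = store
        self.next_lsn = 0
        self.log = []  # local view of the log

    def catch_up(self):
        # Tail: read chunks written by other writers until the first gap.
        while (body := self.store.get(f"chunks/{self.next_lsn}")) is not None:
            self.log.append(body)
            self.next_lsn += 1

    def append(self, payload):
        """Append `payload`, retrying after catch-up on conflict; returns its LSN."""
        while True:
            if self.store.put_if_none_match(f"chunks/{self.next_lsn}", payload):
                self.log.append(payload)
                self.next_lsn += 1
                return self.next_lsn - 1
            self.catch_up()  # lost the race for this LSN
```

With two writers `a` and `b` sharing one store, `a.append(b"x")` takes LSN 0. A later `b.append(b"y")` first fails at LSN 0, catches up, and lands at LSN 1 with `b.log == [b"x", b"y"]`.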
Writer-Garbage Collector conflicts
When garbage collection is active, the PUT-If-None-Match mechanism alone is insufficient.
If the garbage collector has removed chunk n (GC watermark above n) and a lagging writer then tries to create chunk n, PUT-If-None-Match succeeds because the object no longer exists, potentially causing write loss and log divergence. To prevent this, after creating a chunk but before acknowledging the operation, the writer must verify that the GC watermark has not advanced past its LSN, using GET-If-None-Match on the manifest. If the watermark has advanced past the writer’s LSN, the writer must restart and follow the Initialization protocol.
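The scenario can be replayed with a toy store to show why the conditional PUT alone is not enough. Everything here is an assumption made for illustration: the store class, the `gc_lsn` field, and the ordering in which the collector publishes the watermark before deleting chunks. The snapshot that would normally precede collection is omitted.

```python
import json


class InMemoryStore:
    """Toy object store; ETags come from a per-store counter."""

    def __init__(self):
        self._objects = {}  # key -> (etag, body)
        self._version = 0

    def put(self, key, body):
        self._version += 1
        self._objects[key] = (str(self._version), body)
        return self._objects[key][0]

    def put_if_none_match(self, key, body):
        if key in self._objects:
            return False
        self.put(key, body)
        return True

    def delete(self, key):
        self._objects.pop(key, None)

    def get_if_none_match(self, key, etag):
        """Return None when the ETag still matches (HTTP 304), else (etag, body)."""
        current = self._objects[key]
        return None if current[0] == etag else current


def stale_writer_scenario():
    """Return (put_succeeded, must_restart) for a writer that lags behind GC."""
    store = InMemoryStore()
    etag = store.put("manifest", json.dumps({"gc_lsn": 0}).encode())
    for lsn in range(5):
        store.put_if_none_match(f"chunks/{lsn}", b"entry")
    # A slow writer has replayed chunks 0..2 and will try LSN 3 next.
    writer_lsn, writer_etag, writer_gc_lsn = 3, etag, 0
    # Meanwhile the collector publishes watermark 5 and deletes chunks 0..4.
    store.put("manifest", json.dumps({"gc_lsn": 5}).encode())
    for lsn in range(5):
        store.delete(f"chunks/{lsn}")
    # The conditional PUT succeeds because chunk 3 no longer exists.
    put_ok = store.put_if_none_match(f"chunks/{writer_lsn}", b"late write")
    # Only the manifest check reveals that the write must not be acknowledged.
    changed = store.get_if_none_match("manifest", writer_etag)
    if changed is not None:
        writer_gc_lsn = json.loads(changed[1])["gc_lsn"]
    return put_ok, writer_gc_lsn > writer_lsn
```

In this replay `stale_writer_scenario()` returns `(True, True)`: the PUT succeeds, yet the writer must restart with the Initialization protocol instead of acknowledging.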
Tailer-Garbage Collector conflicts
Similar conflicts occur when tailers fall behind the GC watermark during tailing, catch-up recovery, or initialization. When detected, tailers must restart with the Initialization protocol.
Verification
The design has been formally specified and verified using the P programming language. The specification is available at github.com/nvartolomei/oswald and includes an increment-only counter implemented as a Replicated State Machine over OSWALD.