A practical CLI tool for tracking JSON file changes over time. Instead of keeping multiple copies of JSON files, this creates compact delta-based archives that preserve the complete history.
This tool solves a simple problem: you have a JSON file that changes regularly, and you want to track its history without storing dozens of full copies.
json-archive creates a .json.archive file next to your original JSON file. Each time you run the tool, it calculates only what changed and appends those deltas to the archive. You get complete history with minimal storage overhead.
The archive format is human-readable JSONL (not binary), making it easy to inspect, debug, and pipe into other scripts or web visualizations.
Perfect for tracking YouTube video metadata over time:
Hackable over efficient: The file format prioritizes human readability and scriptability over binary compactness. You can:
- Open archives in any text editor
- Grep through them for specific changes
- Parse them in JavaScript without special libraries
- Pipe them through standard Unix tools
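Since every line is a standalone JSON value, a language's built-in JSON parser is enough to read an archive. Here is a minimal Python sketch of that idea. The function name `read_archive_lines` is hypothetical, and the sketch assumes nothing about what each line contains:

```python
import json
from typing import Any, Iterator


def read_archive_lines(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of a JSONL file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)
```

Because it is a generator, only one line is held in memory at a time.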
Minimal workflow changes: Archive files sit next to your original JSON files, with .archive appended to the filename (data.json becomes data.json.archive). Your existing scripts need minimal modification.
While the core design keeps things simple and readable, the tool also supports compressed archives as a practical concession for those who need them. You can read from and write to gzip, brotli, and zlib compressed files without special flags.
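How the tool itself detects compression is not described here. If you read archives from your own scripts, one common approach is to check the file's leading magic bytes. The sketch below assumes that approach and handles gzip only (brotli needs a third-party package, and zlib is left out for brevity). `open_maybe_gzip` is a hypothetical helper, not part of json-archive:

```python
import gzip
from typing import TextIO

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzip(path: str) -> TextIO:
    """Open a text file, transparently decompressing it if it is gzip."""
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")
```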
Important caveat: Compressed archives may require rewriting the entire file during updates (depending on the compression format). If your temporary filesystem is full or too small, updates can fail. In that case, manually specify an output destination with -o to write the new archive elsewhere.
This works fine on the happy path for archive files up to a few hundred megabytes. It runs against the "keep it simple" design philosophy, but it is included because it is practically useful.
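To see why a full rewrite needs scratch space, here is a rough Python sketch of appending one line to a gzip file by rewriting it. This is not the tool's implementation. `append_line_gzip` and its `output_path` parameter are hypothetical, and the sketch puts its temporary file in the destination directory (rather than a system temp location) so the final rename stays on one filesystem:

```python
import gzip
import os
import tempfile
from typing import Optional


def append_line_gzip(archive_path: str, line: str,
                     output_path: Optional[str] = None) -> str:
    """Rewrite a gzip text file with one extra line; return the path written."""
    target = output_path or archive_path
    target_dir = os.path.dirname(os.path.abspath(target))

    # Read the whole existing archive (fine for modest file sizes).
    with gzip.open(archive_path, "rt", encoding="utf-8") as src:
        existing = src.read()

    # Write the complete new contents to a temporary file first, so a
    # failure (for example, a full disk) leaves the original untouched.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    os.close(fd)
    try:
        with gzip.open(tmp_path, "wt", encoding="utf-8") as dst:
            dst.write(existing)
            dst.write(line.rstrip("\n") + "\n")
        os.replace(tmp_path, target)  # replaces the target in one step
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return target
```

The scratch file is as large as the new archive, which is why a small or full filesystem can make the update fail.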
The format is JSONL with delta-based changes using JSON Pointer paths. For complete technical details about the file format, see the file format specification.
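JSON Pointer (RFC 6901) addresses a value inside a document with a slash-separated path, escaping `~` as `~0` and `/` as `~1`. The following Python sketch resolves such a path against parsed JSON. `resolve_pointer` is a hypothetical helper for illustration and is independent of how json-archive itself handles paths:

```python
from typing import Any


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer such as "/stats/views"."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    current = document
    for raw in pointer[1:].split("/"):
        # Unescape "~1" to "/" first, then "~0" to "~" (the order matters).
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            if not token.isdigit():
                raise ValueError(f"invalid array index: {token!r}")
            current = current[int(token)]
        else:
            current = current[token]
    return current
```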
Each observation records:
- What changed (using JSON Pointer paths like /views)
- The old and new values
- When it happened
- A unique observation ID
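The sketch below shows, in Python, how a record holding those four pieces of information could be computed from two versions of a document. The field names (`id`, `timestamp`, `changes`, `op`, `path`, `old`, `new`) are assumptions made for illustration. They are not the actual on-disk layout, which is defined in the file format specification:

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, List


def escape_token(key: str) -> str:
    """Escape an object key for use in a JSON Pointer path."""
    return key.replace("~", "~0").replace("/", "~1")


def diff(old: Any, new: Any, path: str = "") -> List[Dict[str, Any]]:
    """List changes between two parsed JSON values, recursing into objects."""
    changes: List[Dict[str, Any]] = []
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}/{escape_token(key)}"
            if key not in new:
                changes.append({"op": "remove", "path": child, "old": old[key]})
            elif key not in old:
                changes.append({"op": "add", "path": child, "new": new[key]})
            else:
                changes.extend(diff(old[key], new[key], child))
    elif old != new:
        # Arrays and scalars are compared as whole values in this sketch.
        changes.append({"op": "change", "path": path, "old": old, "new": new})
    return changes


def make_observation(old: Any, new: Any) -> Dict[str, Any]:
    """Bundle the changes with a timestamp and a unique ID (hypothetical shape)."""
    return {
        "id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "changes": diff(old, new),
    }
```

For example, comparing `{"views": 100}` with `{"views": 150}` yields a single change at `/views`. Note that Python's `==` treats `1` and `True` as equal, so a production diff would need stricter type checks.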
The tool infers behavior from filenames:
- Info command - View archive metadata and observation timeline
- State command - Retrieve JSON state at specific observations
- File format specification - Technical details about the archive format
Or build from source:
Archives use the .json.archive extension by default:
- data.json -> data.json.archive
- video.info.json -> video.info.json.archive
- config.json -> config.json.archive
This makes it immediately clear which files are archives and which are source files.
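If your own scripts need to apply the same naming convention, a small Python helper is enough. This is a sketch of the mapping listed above. `archive_path_for` and `is_archive` are hypothetical names, and the tool's own filename rules may cover more cases (such as compressed archives) than this does:

```python
from pathlib import Path

ARCHIVE_SUFFIX = ".archive"


def archive_path_for(source: str) -> Path:
    """Map data.json to data.json.archive, in the same directory."""
    p = Path(source)
    return p.with_name(p.name + ARCHIVE_SUFFIX)


def is_archive(path: str) -> bool:
    """True when the filename ends in .json.archive."""
    return Path(path).name.endswith(".json" + ARCHIVE_SUFFIX)
```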
The tool uses descriptive diagnostics instead of cryptic error codes:
Diagnostics are categorized as Fatal, Warning, or Info, and the tool exits with non-zero status only for fatal errors.
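The exit-status rule can be expressed in a few lines. The Python sketch below models it with hypothetical `Severity` and `Diagnostic` types. The specific exit code `1` is an assumption, since the rule above only promises a non-zero status for fatal errors:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Severity(Enum):
    FATAL = "fatal"
    WARNING = "warning"
    INFO = "info"


@dataclass
class Diagnostic:
    severity: Severity
    message: str


def exit_status(diagnostics: Iterable[Diagnostic]) -> int:
    """Return non-zero only if at least one diagnostic is fatal."""
    # Assumption: 1 stands in for whatever non-zero code the tool uses.
    return 1 if any(d.severity is Severity.FATAL for d in diagnostics) else 0
```

A wrapper script can therefore treat a zero exit status as success even when warnings were printed.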
- Memory usage: Bounded by largest single JSON file, not archive size
- Append speed: Fast - only computes deltas, doesn't re-read entire archive
- Read speed: Linear scan, but snapshots allow seeking to recent state
- File size: Typically 10-30% of the size of storing all JSON copies
For very large archives, consider using snapshots (-s flag) to enable faster seeking.
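The reason snapshots help is that a reader can start from the most recent full state instead of replaying every delta from the beginning. The Python sketch below shows that idea on an in-memory list of events. The event shapes (`type`, `state`, `changes`, `op`, `path`, `new`) are assumptions for illustration and do not reflect the real archive layout:

```python
import copy
from typing import Any, Dict, List


def apply_change(state: Dict[str, Any], change: Dict[str, Any]) -> None:
    """Apply one change whose path addresses nested object keys (no arrays)."""
    tokens = [
        t.replace("~1", "/").replace("~0", "~")
        for t in change["path"][1:].split("/")
    ]
    parent = state
    for token in tokens[:-1]:
        parent = parent[token]
    if change["op"] == "remove":
        del parent[tokens[-1]]
    else:  # "add" or "change"
        parent[tokens[-1]] = change["new"]


def latest_state(initial: Dict[str, Any], events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Rebuild the newest state, replaying only deltas after the last snapshot."""
    state = copy.deepcopy(initial)
    start = 0
    # Scan backwards for the most recent snapshot, if any.
    for index in range(len(events) - 1, -1, -1):
        if events[index]["type"] == "snapshot":
            state = copy.deepcopy(events[index]["state"])
            start = index + 1
            break
    for event in events[start:]:
        if event["type"] == "delta":
            for change in event["changes"]:
                apply_change(state, change)
    return state
```

Without a snapshot the loop replays everything from the initial state, which is the linear scan described above.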
Archives can be loaded directly in web applications:
The format uses only standard JSON. No special parsing required.
This is a practical tool built for real workflow needs. Contributions welcome, especially:
- Additional CLI commands (validate, info, extract)
- Performance optimizations for large archives
- More compression format support
- Better diff algorithms for arrays
Built with Rust for reliability and performance. Designed to be simple enough to understand, powerful enough to be useful.