Show HN: TidesDB – High-performance durable, transactional embeddable database
TidesDB is a fast and efficient key value storage engine library written in C.
The underlying data structure is based on a log-structured merge-tree (LSM-tree).
It is not a full-featured database, but rather a library that can be used to build a database on top of, or used as a standalone key-value/column store.
ACID Transactions - Atomic, consistent, isolated (Read Committed), and durable. Transactions support multiple operations across column families. Writers are serialized per column family ensuring atomicity, while COW provides consistency for concurrent readers.
Optimized Concurrency - Writers don't block readers. Readers never block other readers. Background operations won't affect active transactions.
Column Families - Isolated key-value stores. Each column family has its own memtable, SSTables, and WALs.
Bidirectional Iterators - Iterate forward and backward over key-value pairs with heap-based merge-sort across memtable and SSTables. Reference counting prevents premature deletion during iteration.
Write-Ahead Log (WAL) - Durability through WAL. Automatic recovery on startup reconstructs memtables from WALs.
Background Compaction - Automatic background compaction when SSTable count reaches configured max per column family.
Bloom Filters - Reduce disk reads by checking key existence before reading SSTables. Configurable false positive rate.
Compression - Snappy, LZ4, or ZSTD compression for SSTables and WAL entries. Configurable per column family.
TTL Support - Time-to-live for key-value pairs. Expired entries automatically skipped during reads.
Simple API - Clean, easy-to-use C API. Returns 0 on success, -n on error.
Skip List Memtable - COW and atomic skip list for in-memory storage with configurable max level and probability.
Cross-Platform - Linux, macOS, and Windows support with platform abstraction layer.
Sorted Binary Hash Array (SBHA) - Fast SSTable lookups. Direct key-to-block offset mapping without full SSTable scans.
Tombstones - Efficient deletion through tombstone markers. Removed during compaction.
LRU File Handle Cache - Configurable LRU cache for open file handles. Limits system resources while maintaining performance. Set max_open_file_handles to control cache size (0 = disabled).
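To make the Bloom Filters item above concrete, here is a minimal, standalone sketch of the idea: a bit array plus a few hash probes that can say "definitely absent" without touching disk. It does not use TidesDB's own bloom filter. The names (`sketch_bloom_t`, `sketch_bloom_add`, `sketch_bloom_may_contain`), the fixed 1024-bit size, the three probes, and the FNV-1a hash are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sizes only; not TidesDB's actual parameters. */
#define SKETCH_BLOOM_BITS 1024u
#define SKETCH_BLOOM_HASHES 3u

typedef struct
{
    uint8_t bits[SKETCH_BLOOM_BITS / 8];
} sketch_bloom_t;

/* FNV-1a with a per-probe seed mixed in; any reasonable hash would do here. */
uint32_t sketch_hash(const uint8_t *key, size_t len, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    for (size_t i = 0; i < len; i++)
    {
        h ^= key[i];
        h *= 16777619u;
    }
    return h;
}

void sketch_bloom_init(sketch_bloom_t *bf)
{
    assert(bf != NULL);
    memset(bf->bits, 0, sizeof(bf->bits));
}

/* Sets one bit per probe for the given key. */
void sketch_bloom_add(sketch_bloom_t *bf, const uint8_t *key, size_t len)
{
    for (uint32_t i = 0; i < SKETCH_BLOOM_HASHES; i++)
    {
        uint32_t bit = sketch_hash(key, len, i) % SKETCH_BLOOM_BITS;
        bf->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
    }
}

/* Returns 0 if the key is definitely absent, 1 if it may be present. */
int sketch_bloom_may_contain(const sketch_bloom_t *bf, const uint8_t *key, size_t len)
{
    for (uint32_t i = 0; i < SKETCH_BLOOM_HASHES; i++)
    {
        uint32_t bit = sketch_hash(key, len, i) % SKETCH_BLOOM_BITS;
        if ((bf->bits[bit / 8] & (1u << (bit % 8))) == 0) return 0;
    }
    return 1;
}
```

A reader would consult such a filter before opening an SSTable and skip the file when the answer is 0. A lower false positive rate generally costs more bits per key, which is presumably the trade-off behind the configurable rate.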
Using cmake to build the shared library.
rm -rf build && cmake -S . -B build
cmake --build build
cmake --install build
# Production build
rm -rf build && cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DTIDESDB_WITH_SANITIZER=OFF -DTIDESDB_BUILD_TESTS=OFF
cmake --build build --config Release
cmake --install build
# On Linux, run ldconfig to update the shared library cache
ldconfig
Option 1 MinGW-w64 (Recommended for Windows)
MinGW-w64 provides a GCC-based toolchain with better C11 support and POSIX compatibility.
# Clean previous build
Remove-Item -Recurse -Force build -ErrorAction SilentlyContinue
# Configure with MinGW
cmake -S . -B build -G "MinGW Makefiles" -DCMAKE_C_COMPILER=gcc -DCMAKE_TOOLCHAIN_FILE=C:\vcpkg\scripts\buildsystems\vcpkg.cmake
# Build
cmake --build build
# Run tests
cd build
ctest --verbose # or use --output-on-failure to only show failures
# Clean previous build
Remove-Item -Recurse -Force build -ErrorAction SilentlyContinue
# Configure with MSVC
cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=C:\vcpkg\scripts\buildsystems\vcpkg.cmake
# Build (Debug or Release)
cmake --build build --config Debug
# or
cmake --build build --config Release
# Run tests
cd build
ctest -C Debug --verbose
# or
ctest -C Release --verbose
Note: MSVC requires Visual Studio 2019 16.8 or later for C11 atomics support (/experimental:c11atomics). Both Debug and Release builds are fully supported.
You need cmake and a C compiler.
You also require the snappy, lz4, zstd, and openssl libraries.
Join the TidesDB Discord Community to ask questions, work on development, and discuss the future of TidesDB.
#include <tidesdb/tidesdb.h>
/* You can also use other TidesDB components, such as the skip list or bloom filter, from headers under tidesdb/. The prefix also prevents name collisions. */
TidesDB provides detailed error codes for production use. All functions return 0 on success or a negative error code on failure.
extern int _tidesdb_debug_enabled; /* Global debug flag */
/* Enable debug logging */
_tidesdb_debug_enabled = 1;
/* Your operations here - debug logs will be written to stderr */
/* Disable debug logging */
_tidesdb_debug_enabled = 0;
Output
Debug logs are written to stderr with the format
[TidesDB DEBUG] filename:line: message
Redirect to file
./your_program 2> tidesdb_debug.log # Redirect stderr to file
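For readers who want to emit or parse lines in the same layout, the following standalone sketch shows one way a flag-gated logger could produce `[TidesDB DEBUG] filename:line: message`. It is not TidesDB's internal logging code. `sketch_debug_enabled`, `sketch_format_debug`, `sketch_debug_log`, and `SKETCH_DEBUG` are hypothetical names, and the trailing newline is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the library's global flag; the name is hypothetical. */
int sketch_debug_enabled = 0;

/* Formats one line in the documented layout and returns snprintf's result. */
int sketch_format_debug(char *buf, size_t size, const char *file, int line, const char *msg)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "[TidesDB DEBUG] %s:%d: %s\n", file, line, msg);
}

/* Writes the formatted line to stderr, but only while the flag is set. */
void sketch_debug_log(const char *file, int line, const char *msg)
{
    char buf[512];
    if (!sketch_debug_enabled) return;
    sketch_format_debug(buf, sizeof(buf), file, line, msg);
    fputs(buf, stderr);
}

/* Captures the call site automatically. */
#define SKETCH_DEBUG(msg) sketch_debug_log(__FILE__, __LINE__, (msg))
```

Because the output goes to stderr, the `2>` redirection shown above captures it without affecting stdout.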
Column families are isolated key-value stores. Use the config struct for customization or use defaults.
/* Create with default configuration */
tidesdb_column_family_config_t cf_config = tidesdb_default_column_family_config();
if (tidesdb_create_column_family(db, "my_cf", &cf_config) != 0)
{
    /* Handle error */
    return -1;
}
Custom configuration example
tidesdb_column_family_config_t cf_config = {
    .memtable_flush_size = 128 * 1024 * 1024, /* 128MB */
    .max_sstables_before_compaction = 512,    /* trigger compaction at 512 SSTables (min 2 required) */
    .compaction_threads = 4,                  /* use 4 threads for parallel compaction (0 = single-threaded) */
    .max_level = 12,                          /* skip list max level */
    .probability = 0.25f,                     /* skip list probability */
    .compressed = 1,                          /* enable compression */
    .compress_algo = COMPRESS_LZ4,            /* use LZ4 */
    .bloom_filter_fp_rate = 0.01,             /* 1% false positive rate */
    .enable_background_compaction = 1,        /* enable background compaction */
    .background_compaction_interval = 1000000, /* check every 1000000 microseconds (1 second) */
    .use_sbha = 1,                            /* use sorted binary hash array */
    .sync_mode = TDB_SYNC_BACKGROUND,         /* background fsync */
    .sync_interval = 1000,                    /* fsync every 1000ms (1 second) */
    .comparator_name = NULL                   /* NULL = use default "memcmp" */
};
if (tidesdb_create_column_family(db, "my_cf", &cf_config) != 0)
{
    /* Handle error */
    return -1;
}
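The `max_level` and `probability` fields above are the usual skip list knobs. As a rough illustration of how such parameters are commonly used, the sketch below picks a node height by repeated coin flips. It is a textbook approach, not TidesDB's skip list code. `sketch_rng_t`, `sketch_rng_next`, and `sketch_random_level` are hypothetical names, and the small LCG is only there to keep the example deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Small deterministic generator so the sketch has no hidden global state. */
typedef struct
{
    uint32_t state;
} sketch_rng_t;

/* Returns a value in [0, 1). */
float sketch_rng_next(sketch_rng_t *rng)
{
    rng->state = rng->state * 1664525u + 1013904223u; /* common 32-bit LCG constants */
    return (float)(rng->state >> 8) / 16777216.0f;    /* top 24 bits divided by 2^24 */
}

/* Picks a height in [1, max_level]; each extra level is kept with the given probability. */
int sketch_random_level(sketch_rng_t *rng, int max_level, float probability)
{
    assert(max_level >= 1);
    int level = 1;
    while (level < max_level && sketch_rng_next(rng) < probability) level++;
    return level;
}
```

Under this scheme a node reaches level k or higher with probability `probability^(k-1)`, so with 0.25 roughly one node in four gets a second level and one in sixteen a third. `max_level` simply caps the height.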
Using custom comparator
/* Register custom comparator first (see examples/custom_comparator.c) */
tidesdb_register_comparator("reverse", my_reverse_compare);
tidesdb_column_family_config_t cf_config = tidesdb_default_column_family_config();
cf_config.comparator_name = "reverse"; /* use registered comparator */
if (tidesdb_create_column_family(db, "sorted_cf", &cf_config) != 0)
{
    /* Handle error */
    return -1;
}
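The body of `my_reverse_compare` is not shown in this post. As a guess at what a reverse comparator might look like, the standalone sketch below orders keys byte-wise and then flips the result. The signature is an assumption and may not match what `tidesdb_register_comparator` expects, so check examples/custom_comparator.c for the real one. `sketch_memcmp_compare` and `sketch_reverse_compare` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Ascending byte-wise order: memcmp on the common prefix, shorter key first on a tie. */
int sketch_memcmp_compare(const uint8_t *a, size_t a_len, const uint8_t *b, size_t b_len)
{
    assert(a != NULL && b != NULL);
    size_t min_len = a_len < b_len ? a_len : b_len;
    int c = memcmp(a, b, min_len);
    if (c != 0) return c;
    if (a_len == b_len) return 0;
    return a_len < b_len ? -1 : 1;
}

/* Descending order: swap the arguments of the ascending comparator. */
int sketch_reverse_compare(const uint8_t *a, size_t a_len, const uint8_t *b, size_t b_len)
{
    return sketch_memcmp_compare(b, b_len, a, a_len);
}
```

Swapping the arguments, rather than negating the result, avoids any trouble with extreme return values and keeps the two comparators consistent.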
if (tidesdb_drop_column_family(db, "my_cf") != 0)
{
    /* Handle error */
    return -1;
}
Retrieve a column family pointer to use in operations.
tidesdb_column_family_t *cf = tidesdb_get_column_family(db, "my_cf");
if (cf == NULL)
{
    /* Column family not found */
    return -1;
}
Get all column family names in the database.
char **names = NULL;
int count = 0;
if (tidesdb_list_column_families(db, &names, &count) == 0)
{
    printf("Found %d column families:\n", count);
    for (int i = 0; i < count; i++)
    {
        printf(" - %s\n", names[i]);
        free(names[i]); /* Free each name */
    }
    free(names); /* Free the array */
}
Full configuration (compression, bloom filters, sync mode, etc.)
All operations in TidesDB v1 are done through transactions for ACID guarantees.
Basic transaction
tidesdb_txn_t *txn = NULL;
if (tidesdb_txn_begin(db, &txn) != 0)
{
    return -1;
}
/* Put a key-value pair */
const uint8_t *key = (uint8_t *)"mykey";
const uint8_t *value = (uint8_t *)"myvalue";
if (tidesdb_txn_put(txn, "my_cf", key, 5, value, 7, -1) != 0)
{
    tidesdb_txn_free(txn);
    return -1;
}
/* Commit the transaction */
if (tidesdb_txn_commit(txn) != 0)
{
    tidesdb_txn_free(txn);
    return -1;
}
tidesdb_txn_free(txn);
With TTL (time-to-live)
tidesdb_txn_t *txn = NULL;
tidesdb_txn_begin(db, &txn);
const uint8_t *key = (uint8_t *)"temp_key";
const uint8_t *value = (uint8_t *)"temp_value";
/* TTL is Unix timestamp (seconds since epoch) - absolute expiration time */
time_t ttl = time(NULL) + 60; /* Expires 60 seconds from now */
/* Use -1 for no expiration */
tidesdb_txn_put(txn, "my_cf", key, 8, value, 10, ttl);
tidesdb_txn_commit(txn);
tidesdb_txn_free(txn);
TTL Examples
/* No expiration */
time_t ttl = -1;
/* Expire in 5 minutes */
time_t ttl = time(NULL) + (5 * 60);
/* Expire in 1 hour */
time_t ttl = time(NULL) + (60 * 60);
/* Expire at specific time (e.g., midnight) */
time_t ttl = 1730592000; /* Specific Unix timestamp */
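Since the TTL is an absolute timestamp with -1 meaning "never", the read-side check can be very small. The standalone sketch below shows one plausible form of it. `sketch_is_expired` and `sketch_ttl_from_now` are hypothetical helpers, not TidesDB functions, and treating an entry as expired once the clock reaches the timestamp (rather than strictly passing it) is an assumption.

```c
#include <assert.h>
#include <time.h>

/* ttl is an absolute Unix timestamp; -1 means the entry never expires. */
int sketch_is_expired(time_t ttl, time_t now)
{
    assert(ttl >= (time_t)-1);
    if (ttl == (time_t)-1) return 0;
    return now >= ttl; /* assumption: expired once the clock reaches ttl */
}

/* Converts a relative lifetime in seconds into the absolute form used above. */
time_t sketch_ttl_from_now(long seconds)
{
    return time(NULL) + (time_t)seconds;
}
```

Passing a relative value such as 60 directly as the TTL would be read as a timestamp in 1970 and expire immediately, which is why the examples above add `time(NULL)`.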
Getting a key-value pair
tidesdb_txn_t *txn = NULL;
tidesdb_txn_begin_read(db, &txn); /* Read-only transaction */
const uint8_t *key = (uint8_t *)"mykey";
uint8_t *value = NULL;
size_t value_size = 0;
if (tidesdb_txn_get(txn, "my_cf", key, 5, &value, &value_size) == 0)
{
    /* Use value */
    printf("Value: %.*s\n", (int)value_size, value);
    free(value);
}
tidesdb_txn_free(txn);
tidesdb_txn_t *txn = NULL;
tidesdb_txn_begin(db, &txn);
/* Multiple operations in one transaction */
tidesdb_txn_put(txn, "my_cf", (uint8_t *)"key1", 4, (uint8_t *)"value1", 6, -1);
tidesdb_txn_put(txn, "my_cf", (uint8_t *)"key2", 4, (uint8_t *)"value2", 6, -1);
tidesdb_txn_delete(txn, "my_cf", (uint8_t *)"old_key", 7);
/* Commit atomically - all or nothing */
if (tidesdb_txn_commit(txn) != 0)
{
    /* On error, transaction is automatically rolled back */
    tidesdb_txn_free(txn);
    return -1;
}
tidesdb_txn_free(txn);
Transaction rollback
tidesdb_txn_t *txn = NULL;
tidesdb_txn_begin(db, &txn);
tidesdb_txn_put(txn, "my_cf", (uint8_t *)"key", 3, (uint8_t *)"value", 5, -1);
/* Decide to rollback instead of commit */
tidesdb_txn_rollback(txn);
tidesdb_txn_free(txn);
/* No changes were applied */
Iterators provide efficient forward and backward traversal over key-value pairs.
Forward iteration
tidesdb_txn_t *txn = NULL;
tidesdb_txn_begin_read(db, &txn);
tidesdb_iter_t *iter = NULL;
if (tidesdb_iter_new(txn, "my_cf", &iter) != 0)
{
    tidesdb_txn_free(txn);
    return -1;
}
/* Seek to first entry */
tidesdb_iter_seek_to_first(iter);
while (tidesdb_iter_valid(iter))
{
    uint8_t *key = NULL;
    size_t key_size = 0;
    uint8_t *value = NULL;
    size_t value_size = 0;
    if (tidesdb_iter_key(iter, &key, &key_size) == 0 && tidesdb_iter_value(iter, &value, &value_size) == 0)
    {
        /* Use key and value */
        printf("Key: %.*s, Value: %.*s\n",
               (int)key_size, key, (int)value_size, value);
        free(key);
        free(value);
    }
    tidesdb_iter_next(iter);
}
tidesdb_iter_free(iter);
tidesdb_txn_free(txn);
Backward iteration
tidesdb_txn_t *txn = NULL;
tidesdb_txn_begin_read(db, &txn);
tidesdb_iter_t *iter = NULL;
tidesdb_iter_new(txn, "my_cf", &iter);
/* Seek to last entry */
tidesdb_iter_seek_to_last(iter);
while (tidesdb_iter_valid(iter))
{
    /* Process entries in reverse order */
    tidesdb_iter_prev(iter);
}
tidesdb_iter_free(iter);
tidesdb_txn_free(txn);
Iterator Reference Counting and Compaction Safety
TidesDB uses atomic reference counting to ensure safe concurrent access between iterators and compaction
Automatic Reference Counting - When an iterator is created, it acquires references on all active SSTables, preventing them from being deleted
Copy-on-Write (COW) Semantics - Compaction creates new merged SSTables and immediately replaces old ones in the active array, but old SSTables remain in memory for active iterators
Non-Blocking Compaction - Compaction completes immediately without waiting for iterators to finish, ensuring high throughput
Automatic Cleanup - When an iterator is freed, it releases its references. SSTables with zero references are automatically deleted (both file and memory)
Heap-Based Merge - Iterators use a min-heap (forward) or max-heap (backward) to efficiently merge-sort entries from multiple sources
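The heap-based merge mentioned in the last item can be sketched in isolation. The example below merges several sorted integer arrays (standing in for a memtable and SSTables) through a min-heap keyed on each source's current head. It is a generic k-way merge under simplifying assumptions, not TidesDB's iterator. All `sketch_` names are hypothetical, keys are plain ints, and duplicate keys across sources are emitted as-is instead of being resolved to the newest version.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SOURCES 8

/* One sorted input, standing in for a memtable or an SSTable. */
typedef struct
{
    const int *keys;
    size_t len;
    size_t pos;
} sketch_source_t;

/* Binary min-heap of sources, ordered by each source's current key. */
typedef struct
{
    sketch_source_t *items[SKETCH_MAX_SOURCES];
    size_t size;
} sketch_heap_t;

static int sketch_head(const sketch_source_t *s) { return s->keys[s->pos]; }

static void sketch_swap(sketch_heap_t *h, size_t a, size_t b)
{
    sketch_source_t *tmp = h->items[a];
    h->items[a] = h->items[b];
    h->items[b] = tmp;
}

void sketch_heap_push(sketch_heap_t *h, sketch_source_t *s)
{
    assert(h->size < SKETCH_MAX_SOURCES);
    size_t i = h->size++;
    h->items[i] = s;
    while (i > 0) /* sift up */
    {
        size_t parent = (i - 1) / 2;
        if (sketch_head(h->items[parent]) <= sketch_head(h->items[i])) break;
        sketch_swap(h, parent, i);
        i = parent;
    }
}

sketch_source_t *sketch_heap_pop(sketch_heap_t *h)
{
    assert(h->size > 0);
    sketch_source_t *top = h->items[0];
    h->items[0] = h->items[--h->size];
    size_t i = 0;
    for (;;) /* sift down */
    {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < h->size && sketch_head(h->items[l]) < sketch_head(h->items[m])) m = l;
        if (r < h->size && sketch_head(h->items[r]) < sketch_head(h->items[m])) m = r;
        if (m == i) break;
        sketch_swap(h, i, m);
        i = m;
    }
    return top;
}

/* Merges up to out_cap keys in ascending order; returns how many were written. */
size_t sketch_merge(sketch_source_t *sources, size_t n, int *out, size_t out_cap)
{
    sketch_heap_t heap = {.size = 0};
    for (size_t i = 0; i < n; i++)
    {
        sources[i].pos = 0;
        if (sources[i].len > 0) sketch_heap_push(&heap, &sources[i]);
    }
    size_t written = 0;
    while (heap.size > 0 && written < out_cap)
    {
        sketch_source_t *s = sketch_heap_pop(&heap);
        out[written++] = s->keys[s->pos++];
        if (s->pos < s->len) sketch_heap_push(&heap, s); /* re-insert with its next key */
    }
    return written;
}
```

Each step costs O(log k) for k sources, which is why a heap is a natural fit when the number of SSTables grows. A backward iterator would use the same structure with the comparison reversed and each source walked from its end.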
How it works
Iterator creation acquires references on all SSTables (increments ref_count)
Compaction creates new merged SSTables and swaps them into the active array
Compaction releases its reference on old SSTables (decrements ref_count)
Old SSTables remain accessible to active iterators (ref_count > 0)
When iterator is freed, it releases references (decrements ref_count)
When ref_count drops to 0, the SSTable file is deleted and memory is freed
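The steps above can be sketched with C11 atomics. This is a simplified, standalone illustration of the acquire/release pattern, not TidesDB's SSTable code. `sketch_sstable_t` and its functions are hypothetical, and the sketch only frees memory where a real engine would also delete the file.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct
{
    atomic_int ref_count;
    int id;
} sketch_sstable_t;

/* Creates a table holding one reference, owned by the active SSTable array. */
sketch_sstable_t *sketch_sstable_create(int id)
{
    sketch_sstable_t *sst = malloc(sizeof(*sst));
    if (sst == NULL) return NULL;
    atomic_init(&sst->ref_count, 1);
    sst->id = id;
    return sst;
}

/* Called when an iterator starts using the table. */
void sketch_sstable_acquire(sketch_sstable_t *sst)
{
    assert(sst != NULL);
    atomic_fetch_add(&sst->ref_count, 1);
}

/* Drops one reference; returns 1 if it was the last one and the table was freed. */
int sketch_sstable_release(sketch_sstable_t *sst)
{
    assert(sst != NULL);
    if (atomic_fetch_sub(&sst->ref_count, 1) == 1)
    {
        /* A real engine would also unlink the SSTable file at this point. */
        free(sst);
        return 1;
    }
    return 0;
}
```

Whichever party releases last performs the cleanup, so compaction does not need to wait for iterators and iterators do not need to know whether compaction has run.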
tidesdb_iter_t *iter = NULL;
tidesdb_iter_new(txn, "my_cf", &iter); /* Acquires references on SSTables */
tidesdb_iter_seek_to_first(iter);
/* Compaction can occur here - new SSTables replace old ones */
/* But iterator still has valid references to old SSTables */
while (tidesdb_iter_valid(iter))
{
    uint8_t *key = NULL, *value = NULL;
    size_t key_size = 0, value_size = 0;
    tidesdb_iter_key(iter, &key, &key_size);
    tidesdb_iter_value(iter, &value, &value_size);
    /* Process data from snapshot */
    free(key);
    free(value);
    tidesdb_iter_next(iter);
}
tidesdb_iter_free(iter); /* Releases references, triggers cleanup if ref_count == 0 */
Benefits
True Snapshot Isolation - Iterators see a consistent snapshot of data from creation time
No Blocking - Compaction and iteration proceed independently without waiting
Reconstructs in-memory state from persisted WAL entries
Continues normal operation with recovered data
What Gets Recovered
All committed transactions that were written to WAL
Uncommitted transactions are discarded (not in WAL)
Memtables that were being flushed when crash occurred
Example Recovery Scenario
Before Crash:
Active Memtable (ID: 2) → wal_2.log [100 entries]
Flushing Memtable (ID: 1) → wal_1.log [partially written to sstable_1.sst]
SSTable: sstable_0.sst
After Recovery:
Recovered Memtable (ID: 1) from wal_1.log [fully recovered]
Recovered Memtable (ID: 2) from wal_2.log [fully recovered]
SSTable: sstable_0.sst
Next flush will create sstable_1.sst and sstable_2.sst
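To illustrate what "reconstructing a memtable from a WAL" means in practice, the standalone sketch below replays an ordered list of put and delete records into a tiny in-memory table. It is a conceptual model only. The record layout, the array-backed table (a real memtable here is a skip list), and every `sketch_` name are assumptions, and nothing is read from disk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_ENTRIES 16

typedef enum { SKETCH_OP_PUT, SKETCH_OP_DELETE } sketch_op_t;

typedef struct
{
    sketch_op_t op;
    const char *key;
    const char *value; /* NULL for deletes */
} sketch_wal_record_t;

typedef struct
{
    const char *key;
    const char *value; /* NULL marks a tombstone */
} sketch_entry_t;

typedef struct
{
    sketch_entry_t entries[SKETCH_MAX_ENTRIES];
    size_t count;
} sketch_memtable_t;

static sketch_entry_t *sketch_find(sketch_memtable_t *mt, const char *key)
{
    for (size_t i = 0; i < mt->count; i++)
        if (strcmp(mt->entries[i].key, key) == 0) return &mt->entries[i];
    return NULL;
}

/* Applies one record; later records overwrite earlier ones for the same key. */
void sketch_apply(sketch_memtable_t *mt, const sketch_wal_record_t *rec)
{
    sketch_entry_t *e = sketch_find(mt, rec->key);
    if (e == NULL)
    {
        assert(mt->count < SKETCH_MAX_ENTRIES);
        e = &mt->entries[mt->count++];
        e->key = rec->key;
    }
    e->value = (rec->op == SKETCH_OP_PUT) ? rec->value : NULL;
}

/* Replays a whole log in write order and returns the number of records applied. */
size_t sketch_replay(sketch_memtable_t *mt, const sketch_wal_record_t *records, size_t n)
{
    mt->count = 0;
    for (size_t i = 0; i < n; i++) sketch_apply(mt, &records[i]);
    return n;
}

/* Returns the live value, or NULL if the key is missing or tombstoned. */
const char *sketch_get(sketch_memtable_t *mt, const char *key)
{
    sketch_entry_t *e = sketch_find(mt, key);
    return e ? e->value : NULL;
}
```

In the scenario above, this replay would run once per WAL file, with wal_1.log rebuilding memtable 1 and wal_2.log rebuilding memtable 2. Replaying in write order is what makes the last write for a key win.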
tidesdb_drop_column_family(db, "my_cf");
/* Deletes mydb/my_cf/ directory and all contents (WALs, SSTables) */
Manual cleanup (if needed)
# Backup before cleanup
cp -r mydb mydb_backup
# Remove specific column family
rm -rf mydb/old_cf/
# Clean up old database
rm -rf mydb/
# Check total database size
du -sh mydb/
# Check per-column-family size
du -sh mydb/*/
# Count WAL files (should be 1-2 per CF normally)
find mydb/ -name "wal_*.log" | wc -l
# Count SSTable files
find mydb/ -name "sstable_*.sst" | wc -l
# List largest SSTables
find mydb/ -name "sstable_*.sst" -exec ls -lh {} \; | sort -k5 -hr | head -10
Backup Strategy
# Stop writes, flush all memtables, then backup
# In your application
tidesdb_flush_memtable(cf); # Force flush before backup
# Then backup
tar -czf mydb_backup.tar.gz mydb/
Disk Space Monitoring
Monitor WAL file count - typically 1-3 per column family (1 active + 1-2 in flush queue)
Many WAL files (>5) may indicate a flush backlog, slow I/O, or a configuration issue
Monitor SSTable count - triggers compaction at max_sstables_before_compaction
Set appropriate memtable_flush_size based on write patterns and flush speed
Performance Tuning
Larger memtable_flush_size = fewer, larger SSTables = less compaction
Smaller memtable_flush_size = more, smaller SSTables = more compaction
Adjust max_sstables_before_compaction based on read/write ratio
Use enable_background_compaction for automatic maintenance
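A back-of-the-envelope calculation can help when choosing these values. The sketch below estimates how many SSTables a given write volume produces and how much data must be written before the compaction threshold is reached. It assumes every flush writes exactly `memtable_flush_size` bytes and ignores compression, overwrites, and per-entry overhead, so treat the results as rough upper-level guidance only. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of full memtables flushed after writing bytes_written. */
uint64_t sketch_flush_count(uint64_t bytes_written, uint64_t memtable_flush_size)
{
    assert(memtable_flush_size > 0);
    return bytes_written / memtable_flush_size;
}

/* Bytes written before the SSTable count reaches the compaction threshold. */
uint64_t sketch_bytes_until_compaction(uint64_t memtable_flush_size, uint64_t max_sstables)
{
    return memtable_flush_size * max_sstables;
}
```

Under these assumptions, writing 1 GiB with a 128 MiB flush size yields 8 SSTables, while a 16 MiB flush size yields 64. With the 128 MiB and 512 values from the custom configuration example, the threshold would be reached after about 64 GiB of writes.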
TidesDB is designed for high concurrency with minimal blocking
Reader-Writer Locks
Each column family has a reader-writer lock
Multiple readers can read concurrently - no blocking between readers
Writers don't block readers - readers can access data while writes are in progress
Writers block other writers - only one writer per column family at a time