High-performance hash utility with fast mode


High-performance cryptographic hash utility with SIMD optimization.

  • Algorithms: MD5, SHA-1, SHA-2/3, BLAKE2/3, xxHash3/128
  • SIMD: Automatic hardware acceleration (SSE, AVX, AVX2, AVX-512, NEON)
  • Fast Mode: Quick hashing for large files (samples 300MB)
  • Flexible Input: Files, stdin, or text strings
  • Wildcard Patterns: Support for *, ?, and [...] patterns in file/directory arguments
  • Directory Scanning: Recursive hashing with parallel processing
  • Verification: Compare hashes against stored database
  • Database Comparison: Compare two databases to identify changes, duplicates, and differences
  • .hashignore: Exclude files using gitignore patterns
  • Formats: Standard, hashdeep, JSON
  • Compression: LZMA compression for databases
  • Cross-Platform: Linux, macOS, Windows, FreeBSD
Build and quick start:

cargo build --release

# Hash a file
./target/release/hash myfile.txt -a sha256

# Hash text
./target/release/hash --text "hello world" -a sha256

# Hash from stdin
cat myfile.txt | ./target/release/hash -a sha256

# Scan directory
./target/release/hash scan -d ./my_dir -a sha256 -o hashes.db

# Verify
./target/release/hash verify -b hashes.db -d ./my_dir

# List algorithms
./target/release/hash list
Hash individual files:

hash myfile.txt -a sha256               # Single algorithm
hash myfile.txt -a sha256 -a blake3     # Multiple algorithms
hash largefile.iso -f -a blake3         # Fast mode
hash myfile.txt -a sha256 -o output.txt # Save to file
hash myfile.txt -a sha256 --json        # JSON output
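When more than one algorithm is requested, the input can be read once and fed to every hasher. The sketch below illustrates that single-pass idea, assuming the sha2 and blake3 crates; it is not necessarily how the utility is implemented internally.

use sha2::{Digest, Sha256};
use std::fs::File;
use std::io::Read;

fn main() -> std::io::Result<()> {
    // Illustrative only: stream the file once, updating every requested
    // hasher (here SHA-256 and BLAKE3) with each 1 MB buffer.
    let mut file = File::open("myfile.txt")?;
    let mut sha256 = Sha256::new();
    let mut b3 = blake3::Hasher::new();

    let mut buf = vec![0u8; 1 << 20];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        sha256.update(&buf[..n]);
        b3.update(&buf[..n]);
    }

    let digest = sha256.finalize();
    let sha_hex: String = digest.as_slice().iter().map(|b| format!("{:02x}", b)).collect();
    println!("sha256: {sha_hex}");
    println!("blake3: {}", b3.finalize().to_hex());
    Ok(())
}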

Hash multiple files using wildcard patterns:

hash "*.txt" -a sha256 # All .txt files hash "file?.bin" -a sha256 # file1.bin, fileA.bin, etc. hash "[abc]*.jpg" -a sha256 # Files starting with a, b, or c hash "img202405*.jpg" -a sha256 # All images from May 2024

Patterns work with all commands:

hash scan -d "data/*/hashes" -a sha256 -o output.db # Multiple directories hash verify -b "*.db" -d "data/*" --json # Multiple databases/dirs
hash --text "hello world" -a sha256 # Hash text cat myfile.txt | hash -a sha256 # Hash from stdin
Scan a directory:

hash scan -d /path/to/dir -a sha256 -o hashes.db                   # Basic
hash scan -d /path/to/dir -a sha256 -o hashes.db -p                # Parallel
hash scan -d /path/to/dir -a sha256 -o hashes.db -f                # Fast mode
hash scan -d /path/to/dir -a sha256 -o hashes.db -p -f             # Both
hash scan -d /path/to/dir -a sha256 -o hashes.db --compress        # Compressed
hash scan -d /path/to/dir -a sha256 -o hashes.db --format hashdeep # Hashdeep
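For illustration, the -p behavior can be approximated with a rayon thread pool: hash every file on a worker thread and collect the results. This is a sketch of the general technique (rayon plus blake3 here), not the project's actual implementation.

use rayon::prelude::*;
use std::path::PathBuf;

// Sketch only: hash a list of files in parallel on rayon's thread pool.
// Unreadable files are skipped rather than aborting the whole scan.
fn hash_all(files: Vec<PathBuf>) -> Vec<(PathBuf, String)> {
    files
        .into_par_iter()
        .filter_map(|path| {
            let bytes = std::fs::read(&path).ok()?;
            let digest = blake3::hash(&bytes);
            Some((path, digest.to_hex().to_string()))
        })
        .collect()
}

fn main() {
    let files = vec![PathBuf::from("a.bin"), PathBuf::from("b.bin")];
    for (path, digest) in hash_all(files) {
        println!("{digest}  {}", path.display());
    }
}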
Verify a directory against a database:

hash verify -b hashes.db -d /path/to/dir        # Verify
hash verify -b hashes.db.xz -d /path/to/dir     # Compressed
hash verify -b hashes.db -d /path/to/dir --json # JSON

Output shows: Matches, Mismatches, Missing files, New files

Compare two hash databases to identify changes, duplicates, and differences:

hash compare db1.txt db2.txt               # Compare two databases
hash compare db1.txt db2.txt -o report.txt # Save report to file
hash compare db1.txt db2.txt --format json # JSON output
hash compare db1.txt.xz db2.txt.xz         # Compare compressed databases
hash compare db1.txt db2.txt.xz            # Mix compressed and plain

Output shows:

  • Unchanged: Files with same hash in both databases
  • Changed: Files with different hashes
  • Removed: Files in DB1 but not DB2
  • Added: Files in DB2 but not DB1
  • Duplicates: Files with same hash within each database
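Conceptually, the comparison reduces to set operations over two path-to-hash maps, as in the sketch below (illustrative only; function names such as compare and duplicates are hypothetical, not the tool's actual code):

use std::collections::HashMap;

// Sketch of the comparison categories: each database is a map of
// file path -> hash, and the keys of the two maps are intersected/diffed.
fn compare(db1: &HashMap<String, String>, db2: &HashMap<String, String>) {
    let (mut unchanged, mut changed, mut removed) = (0, 0, 0);
    for (path, hash1) in db1 {
        match db2.get(path) {
            Some(hash2) if hash2 == hash1 => unchanged += 1, // same hash in both
            Some(_) => changed += 1,                         // different hashes
            None => removed += 1,                            // in DB1 but not DB2
        }
    }
    // In DB2 but not DB1.
    let added = db2.keys().filter(|p| !db1.contains_key(*p)).count();
    println!("unchanged {unchanged}, changed {changed}, removed {removed}, added {added}");
}

// Duplicates within one database: paths that share the same hash value.
fn duplicates(db: &HashMap<String, String>) -> Vec<Vec<&String>> {
    let mut by_hash: HashMap<&String, Vec<&String>> = HashMap::new();
    for (path, hash) in db {
        by_hash.entry(hash).or_default().push(path);
    }
    by_hash.into_values().filter(|paths| paths.len() > 1).collect()
}

fn main() {
    let db1 = HashMap::from([("a.txt".to_string(), "hash-a".to_string())]);
    let db2 = HashMap::from([("a.txt".to_string(), "hash-b".to_string())]);
    compare(&db1, &db2);
    println!("duplicate groups in DB1: {}", duplicates(&db1).len());
}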
Benchmark and list algorithms:

hash benchmark        # Benchmark all algorithms
hash benchmark -s 500 # Custom data size
hash list             # List algorithms
hash list --json      # JSON output
Command     Option                  Description
            FILE                    File or wildcard pattern to hash (omit for stdin)
            -t, --text <TEXT>       Hash text string
            -a, --algorithm <ALG>   Algorithm (default: sha256)
            -o, --output <FILE>     Write to file
            -f, --fast              Fast mode (samples 300MB)
            --json                  JSON output
scan        -d, --directory <DIR>   Directory or wildcard pattern to scan
            -a, --algorithm <ALG>   Algorithm (default: sha256)
            -o, --output <FILE>     Output database
            -p, --parallel          Parallel processing
            -f, --fast              Fast mode
            --format <FMT>          standard or hashdeep
            --compress              LZMA compression
            --json                  JSON output
verify      -b, --database <FILE>   Database file or wildcard pattern
            -d, --directory <DIR>   Directory or wildcard pattern to verify
            --json                  JSON output
compare     DATABASE1               First database file (supports .xz)
            DATABASE2               Second database file (supports .xz)
            -o, --output <FILE>     Write report to file
            --format <FMT>          plain-text, json, or hashdeep
benchmark   -s, --size <MB>         Data size (default: 100)
            --json                  JSON output

Exclude files using gitignore-style patterns:

cat > /path/to/dir/.hashignore << 'EOF'
*.log
*.tmp
build/
node_modules/
!important.log
EOF

hash scan -d /path/to/dir -a sha256 -o hashes.db

Patterns: *.ext, dir/, !pattern, #comments, **/*.ext
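One plausible way to honor these patterns is the ignore crate, which supports custom ignore filenames with gitignore semantics. The sketch below is an assumption about the approach, not the tool's actual code:

use ignore::WalkBuilder;

fn main() {
    // Walk a directory while honoring .hashignore files (gitignore-style).
    // Sketch only: the real scanner may configure the walk differently.
    let walker = WalkBuilder::new("/path/to/dir")
        .add_custom_ignore_filename(".hashignore")
        .build();

    for entry in walker.filter_map(Result::ok) {
        if entry.file_type().map_or(false, |t| t.is_file()) {
            println!("would hash: {}", entry.path().display());
        }
    }
}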

Standard (default):

<hash> <algorithm> <mode> <filepath>
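For example, a standard-format entry for an empty file hashed with SHA-256 could look like the following line (the exact rendering of the mode field, shown here as full, is an assumption):

e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 sha256 full /data/empty.txt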

Hashdeep: CSV format with file size, compatible with the hashdeep tool

JSON: Structured output for automation

Algorithm   Throughput     Use Case
xxHash3     10-30 GB/s     Non-crypto, max speed
BLAKE3      1-3 GB/s       Crypto, fastest
SHA-512     600-900 MB/s   Crypto, 64-bit
SHA-256     500-800 MB/s   Crypto, common
SHA3-256    200-400 MB/s   Post-quantum

Tips:

  • Use -p for parallel (2-4x faster)
  • Use -f for large files (10-100x faster)
  • Use BLAKE3 for fastest crypto
  • Compile with RUSTFLAGS="-C target-cpu=native" for best performance

Fast Mode Speedup:

  • 1 GB: ~7x faster
  • 10 GB: ~67x faster
  • 100 GB: ~667x faster

Samples 300MB (first/middle/last 100MB) instead of the entire file.

Good for: quick checks, large files, backups.
Not for: full verification, forensics, small files.
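A minimal sketch of the sampling idea, assuming BLAKE3 and exactly three 100 MB regions (first, middle, last); the actual implementation may buffer differently or handle edge cases another way:

use std::fs::File;
use std::io::{Read, Seek, SeekFrom};

const CHUNK: u64 = 100 * 1024 * 1024; // 100 MB per sampled region

// Sketch of fast mode: hash the first, middle, and last 100 MB instead of
// the whole file. Small files are hashed in full.
fn fast_hash(path: &str) -> std::io::Result<blake3::Hash> {
    let mut file = File::open(path)?;
    let len = file.metadata()?.len();
    let mut hasher = blake3::Hasher::new();

    if len <= 3 * CHUNK {
        // The three regions would overlap, so just hash everything once.
        let mut all = Vec::new();
        file.read_to_end(&mut all)?;
        hasher.update(&all);
        return Ok(hasher.finalize());
    }

    let offsets = [0, (len - CHUNK) / 2, len - CHUNK];
    let mut buf = vec![0u8; CHUNK as usize];
    for off in offsets {
        file.seek(SeekFrom::Start(off))?;
        file.read_exact(&mut buf)?;
        hasher.update(&buf);
    }
    Ok(hasher.finalize())
}

fn main() -> std::io::Result<()> {
    println!("fast hash: {}", fast_hash("largefile.iso")?.to_hex());
    Ok(())
}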

Example workflows:

# Verify downloaded file
hash downloaded-file.iso -a sha256

# Backup verification
hash scan -d /data -a sha256 -o backup.db -p
hash verify -b backup.db -d /data

# Monitor changes
hash scan -d /etc/config -a sha256 -o baseline.db
hash verify -b baseline.db -d /etc/config

# Compare two snapshots
hash scan -d /data -a sha256 -o snapshot1.db
# ... time passes ...
hash scan -d /data -a sha256 -o snapshot2.db
hash compare snapshot1.db snapshot2.db -o changes.txt

# Find duplicates
hash scan -d /media -a sha256 -o media.db
hash compare media.db media.db   # Compare with itself

# Forensic analysis
hash scan -d /evidence -a sha3-256 -o evidence.db
hash scan -d /evidence -a sha256 -o evidence.txt --format hashdeep

# Quick checksums
hash large-backup.tar.gz -f -a blake3
hash scan -d /backups -a blake3 -o checksums.db -p -f

# Automation
hash verify -b hashes.db -d /data --json | jq '.report.mismatches'
hash compare db1.db db2.db --format json | jq '.summary'

Recommended:

  • SHA-256: Widely supported, good security
  • BLAKE3: Fastest cryptographic hash
  • SHA3-256: Post-quantum resistant

Deprecated:

  • MD5, SHA-1: Use only for compatibility

Non-crypto (trusted environments):

  • xxHash3/128: Maximum speed

Automatic support for SSE, AVX, AVX2, AVX-512 (x86_64) and NEON (ARM).

Verify: cargo test --release --test simd_verification -- --nocapture

See SIMD_OPTIMIZATION.md for details.
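Runtime dispatch of this kind typically relies on the standard library's feature-detection macros; the sketch below only reports which path would be chosen and is an assumption about the mechanism (SIMD_OPTIMIZATION.md describes what the project actually does):

// Sketch: pick the widest SIMD path the CPU supports at runtime.
fn pick_impl() -> &'static str {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx512f") {
            return "avx512";
        } else if is_x86_feature_detected!("avx2") {
            return "avx2";
        } else if is_x86_feature_detected!("sse4.1") {
            return "sse";
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return "neon";
        }
    }
    "scalar"
}

fn main() {
    println!("selected SIMD path: {}", pick_impl());
}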

Supported patterns:

  • * - Matches any number of characters (e.g., *.txt, file*)
  • ? - Matches exactly one character (e.g., file?.bin)
  • [...] - Matches any character in brackets (e.g., [abc]*.jpg)

Examples:

hash "*.txt" -a sha256 # All .txt files in current dir hash "data/*.bin" -a sha256 # All .bin files in data/ hash "file?.txt" -a sha256 # file1.txt, fileA.txt, etc. hash "[abc]*.jpg" -a sha256 # Files starting with a, b, or c hash scan -d "backup/*/data" -a sha256 -o db.txt # Multiple directories hash verify -b "*.db" -d "data/*" # All .db files against all data dirs

Notes:

  • Patterns are expanded by the shell or the application (application-side expansion is sketched after this list)
  • If no files match, an error is displayed
  • Multiple matches are processed in sorted order
  • For scan/verify with multiple directories, results are aggregated
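Application-side expansion can be done with the glob crate, as sketched below (the helper name expand_pattern is hypothetical; the real logic may differ). Matches are sorted so processing order is deterministic:

use glob::glob;

// Sketch: expand a wildcard pattern into a sorted list of matching paths,
// reporting an error when nothing matches.
fn expand_pattern(pattern: &str) -> Result<Vec<std::path::PathBuf>, String> {
    let mut paths: Vec<_> = glob(pattern)
        .map_err(|e| format!("invalid pattern '{pattern}': {e}"))?
        .filter_map(Result::ok)
        .collect();

    if paths.is_empty() {
        return Err(format!("no files match pattern '{pattern}'"));
    }
    paths.sort();
    Ok(paths)
}

fn main() {
    match expand_pattern("*.txt") {
        Ok(paths) => paths.iter().for_each(|p| println!("{}", p.display())),
        Err(e) => eprintln!("{e}"),
    }
}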
Troubleshooting:

Issue                          Solution
Unsupported algorithm          Run hash list to see available algorithms
Permission errors              Use sudo hash scan -d /protected/dir ...
Slow performance               Use -p for parallel, -f for fast mode, or BLAKE3
Fast mode not working          Fast mode only works with files (not stdin/text)
.hashignore not working        Check file location: /path/to/dir/.hashignore
Wildcard pattern not matching  Ensure pattern is quoted (e.g., "*.txt" not *.txt)
No files match pattern         Check pattern syntax and file locations