High-performance hash utility with fast mode


High-performance cryptographic hash utility with SIMD optimization.

  • Algorithms: MD5, SHA-1, SHA-2/3, BLAKE2/3, xxHash3/128
  • SIMD: Automatic hardware acceleration (SSE, AVX, AVX2, AVX-512, NEON)
  • Fast Mode: Quick hashing for large files (samples 300MB)
  • Flexible Input: Files, stdin, or text strings
  • Wildcard Patterns: Support for *, ?, and [...] patterns in file/directory arguments
  • Directory Scanning: Recursive hashing with parallel processing
  • Verification: Compare hashes against stored database
  • Database Comparison: Compare two databases to identify changes, duplicates, and differences
  • .hashignore: Exclude files using gitignore patterns
  • Formats: Standard, hashdeep, JSON
  • Compression: LZMA compression for databases
  • Cross-Platform: Linux, macOS, Windows, FreeBSD
Build and quick start:

cargo build --release

# Hash a file
./target/release/hash myfile.txt -a sha256

# Hash text
./target/release/hash --text "hello world" -a sha256

# Hash from stdin
cat myfile.txt | ./target/release/hash -a sha256

# Scan directory
./target/release/hash scan -d ./my_dir -a sha256 -o hashes.db

# Verify
./target/release/hash verify -b hashes.db -d ./my_dir

# List algorithms
./target/release/hash list
Hash individual files:

hash myfile.txt -a sha256               # Single algorithm
hash myfile.txt -a sha256 -a blake3     # Multiple algorithms
hash largefile.iso -f -a blake3         # Fast mode
hash myfile.txt -a sha256 -o output.txt # Save to file
hash myfile.txt -a sha256 --json        # JSON output
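When more than one algorithm is requested, the input can be read once and fed to every hasher. The sketch below illustrates that single-pass idea, assuming the sha2 and blake3 crates; it is not necessarily how the utility is implemented internally.

use sha2::{Digest, Sha256};
use std::fs::File;
use std::io::Read;

fn main() -> std::io::Result<()> {
    // Illustrative only: stream the file once, updating every requested
    // hasher (here SHA-256 and BLAKE3) with each 1 MB buffer.
    let mut file = File::open("myfile.txt")?;
    let mut sha256 = Sha256::new();
    let mut b3 = blake3::Hasher::new();

    let mut buf = vec![0u8; 1 << 20];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        sha256.update(&buf[..n]);
        b3.update(&buf[..n]);
    }

    let digest = sha256.finalize();
    let sha_hex: String = digest.as_slice().iter().map(|b| format!("{:02x}", b)).collect();
    println!("sha256: {sha_hex}");
    println!("blake3: {}", b3.finalize().to_hex());
    Ok(())
}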

Hash multiple files using wildcard patterns:

hash "*.txt" -a sha256 # All .txt files hash "file?.bin" -a sha256 # file1.bin, fileA.bin, etc. hash "[abc]*.jpg" -a sha256 # Files starting with a, b, or c hash "img202405*.jpg" -a sha256 # All images from May 2024

Patterns work with all commands:

hash scan -d "data/*/hashes" -a sha256 -o output.db # Multiple directories hash verify -b "*.db" -d "data/*" --json # Multiple databases/dirs
hash --text "hello world" -a sha256 # Hash text cat myfile.txt | hash -a sha256 # Hash from stdin
Scan a directory:

hash scan -d /path/to/dir -a sha256 -o hashes.db                   # Basic
hash scan -d /path/to/dir -a sha256 -o hashes.db -p                # Parallel
hash scan -d /path/to/dir -a sha256 -o hashes.db -f                # Fast mode
hash scan -d /path/to/dir -a sha256 -o hashes.db -p -f             # Both
hash scan -d /path/to/dir -a sha256 -o hashes.db --compress        # Compressed
hash scan -d /path/to/dir -a sha256 -o hashes.db --format hashdeep # Hashdeep
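For illustration, the -p behavior can be approximated with a rayon thread pool: hash every file on a worker thread and collect the results. This is a sketch of the general technique (rayon plus blake3 here), not the project's actual implementation.

use rayon::prelude::*;
use std::path::PathBuf;

// Sketch only: hash a list of files in parallel on rayon's thread pool.
// Unreadable files are skipped rather than aborting the whole scan.
fn hash_all(files: Vec<PathBuf>) -> Vec<(PathBuf, String)> {
    files
        .into_par_iter()
        .filter_map(|path| {
            let bytes = std::fs::read(&path).ok()?;
            let digest = blake3::hash(&bytes);
            Some((path, digest.to_hex().to_string()))
        })
        .collect()
}

fn main() {
    let files = vec![PathBuf::from("a.bin"), PathBuf::from("b.bin")];
    for (path, digest) in hash_all(files) {
        println!("{digest}  {}", path.display());
    }
}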
Verify a directory against a database:

hash verify -b hashes.db -d /path/to/dir        # Verify
hash verify -b hashes.db.xz -d /path/to/dir     # Compressed
hash verify -b hashes.db -d /path/to/dir --json # JSON

Output shows: Matches, Mismatches, Missing files, New files

Compare two hash databases to identify changes, duplicates, and differences:

hash compare db1.txt db2.txt               # Compare two databases
hash compare db1.txt db2.txt -o report.txt # Save report to file
hash compare db1.txt db2.txt --format json # JSON output
hash compare db1.txt.xz db2.txt.xz         # Compare compressed databases
hash compare db1.txt db2.txt.xz            # Mix compressed and plain

Output shows:

  • Unchanged: Files with same hash in both databases
  • Changed: Files with different hashes
  • Removed: Files in DB1 but not DB2
  • Added: Files in DB2 but not DB1
  • Duplicates: Files with same hash within each database
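Conceptually, the comparison reduces to set operations over two path-to-hash maps, as in the sketch below (illustrative only; function names such as compare and duplicates are hypothetical, not the tool's actual code):

use std::collections::HashMap;

// Sketch of the comparison categories: each database is a map of
// file path -> hash, and the keys of the two maps are intersected/diffed.
fn compare(db1: &HashMap<String, String>, db2: &HashMap<String, String>) {
    let (mut unchanged, mut changed, mut removed) = (0, 0, 0);
    for (path, hash1) in db1 {
        match db2.get(path) {
            Some(hash2) if hash2 == hash1 => unchanged += 1, // same hash in both
            Some(_) => changed += 1,                         // different hashes
            None => removed += 1,                            // in DB1 but not DB2
        }
    }
    // In DB2 but not DB1.
    let added = db2.keys().filter(|p| !db1.contains_key(*p)).count();
    println!("unchanged {unchanged}, changed {changed}, removed {removed}, added {added}");
}

// Duplicates within one database: paths that share the same hash value.
fn duplicates(db: &HashMap<String, String>) -> Vec<Vec<&String>> {
    let mut by_hash: HashMap<&String, Vec<&String>> = HashMap::new();
    for (path, hash) in db {
        by_hash.entry(hash).or_default().push(path);
    }
    by_hash.into_values().filter(|paths| paths.len() > 1).collect()
}

fn main() {
    let db1 = HashMap::from([("a.txt".to_string(), "hash-a".to_string())]);
    let db2 = HashMap::from([("a.txt".to_string(), "hash-b".to_string())]);
    compare(&db1, &db2);
    println!("duplicate groups in DB1: {}", duplicates(&db1).len());
}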
Benchmark and list algorithms:

hash benchmark        # Benchmark all algorithms
hash benchmark -s 500 # Custom data size
hash list             # List algorithms
hash list --json      # JSON output
Command     Option                  Description
            FILE                    File or wildcard pattern to hash (omit for stdin)
            -t, --text <TEXT>       Hash text string
            -a, --algorithm <ALG>   Algorithm (default: sha256)
            -o, --output <FILE>     Write to file
            -f, --fast              Fast mode (samples 300MB)
            --json                  JSON output
scan        -d, --directory <DIR>   Directory or wildcard pattern to scan
            -a, --algorithm <ALG>   Algorithm (default: sha256)
            -o, --output <FILE>     Output database
            -p, --parallel          Parallel processing
            -f, --fast              Fast mode
            --format <FMT>          standard or hashdeep
            --compress              LZMA compression
            --json                  JSON output
verify      -b, --database <FILE>   Database file or wildcard pattern
            -d, --directory <DIR>   Directory or wildcard pattern to verify
            --json                  JSON output
compare     DATABASE1               First database file (supports .xz)
            DATABASE2               Second database file (supports .xz)
            -o, --output <FILE>     Write report to file
            --format <FMT>          plain-text, json, or hashdeep
benchmark   -s, --size <MB>         Data size (default: 100)
            --json                  JSON output

Exclude files using gitignore-style patterns:

cat > /path/to/dir/.hashignore << 'EOF'
*.log
*.tmp
build/
node_modules/
!important.log
EOF

hash scan -d /path/to/dir -a sha256 -o hashes.db

Patterns: *.ext, dir/, !pattern, #comments, **/*.ext
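One plausible way to honor these patterns is the ignore crate, which supports custom ignore filenames with gitignore semantics. The sketch below is an assumption about the approach, not the tool's actual code:

use ignore::WalkBuilder;

fn main() {
    // Walk a directory while honoring .hashignore files (gitignore-style).
    // Sketch only: the real scanner may configure the walk differently.
    let walker = WalkBuilder::new("/path/to/dir")
        .add_custom_ignore_filename(".hashignore")
        .build();

    for entry in walker.filter_map(Result::ok) {
        if entry.file_type().map_or(false, |t| t.is_file()) {
            println!("would hash: {}", entry.path().display());
        }
    }
}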

Standard (default):

<hash> <algorithm> <mode> <filepath>
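For example, a standard-format entry for an empty file hashed with SHA-256 could look like the following line (the exact rendering of the mode field, shown here as full, is an assumption):

e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 sha256 full /data/empty.txt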

Hashdeep: CSV format with file size, compatible with the hashdeep tool

JSON: Structured output for automation

Algorithm   Throughput     Use Case
xxHash3     10-30 GB/s     Non-crypto, max speed
BLAKE3      1-3 GB/s       Crypto, fastest
SHA-512     600-900 MB/s   Crypto, 64-bit
SHA-256     500-800 MB/s   Crypto, common
SHA3-256    200-400 MB/s   Post-quantum

Tips:

  • Use -p for parallel (2-4x faster)
  • Use -f for large files (10-100x faster)
  • Use BLAKE3 for fastest crypto
  • Compile with RUSTFLAGS="-C target-cpu=native" for best performance

Fast Mode Speedup:

  • 1 GB: ~7x faster
  • 10 GB: ~67x faster
  • 100 GB: ~667x faster

Samples 300MB (first/middle/last 100MB) instead of the entire file.

Good for: quick checks, large files, backups.
Not for: full verification, forensics, small files.
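A minimal sketch of the sampling idea, assuming BLAKE3 and exactly three 100 MB regions (first, middle, last); the actual implementation may buffer differently or handle edge cases another way:

use std::fs::File;
use std::io::{Read, Seek, SeekFrom};

const CHUNK: u64 = 100 * 1024 * 1024; // 100 MB per sampled region

// Sketch of fast mode: hash the first, middle, and last 100 MB instead of
// the whole file. Small files are hashed in full.
fn fast_hash(path: &str) -> std::io::Result<blake3::Hash> {
    let mut file = File::open(path)?;
    let len = file.metadata()?.len();
    let mut hasher = blake3::Hasher::new();

    if len <= 3 * CHUNK {
        // The three regions would overlap, so just hash everything once.
        let mut all = Vec::new();
        file.read_to_end(&mut all)?;
        hasher.update(&all);
        return Ok(hasher.finalize());
    }

    let offsets = [0, (len - CHUNK) / 2, len - CHUNK];
    let mut buf = vec![0u8; CHUNK as usize];
    for off in offsets {
        file.seek(SeekFrom::Start(off))?;
        file.read_exact(&mut buf)?;
        hasher.update(&buf);
    }
    Ok(hasher.finalize())
}

fn main() -> std::io::Result<()> {
    println!("fast hash: {}", fast_hash("largefile.iso")?.to_hex());
    Ok(())
}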

Example workflows:

# Verify downloaded file
hash downloaded-file.iso -a sha256

# Backup verification
hash scan -d /data -a sha256 -o backup.db -p
hash verify -b backup.db -d /data

# Monitor changes
hash scan -d /etc/config -a sha256 -o baseline.db
hash verify -b baseline.db -d /etc/config

# Compare two snapshots
hash scan -d /data -a sha256 -o snapshot1.db
# ... time passes ...
hash scan -d /data -a sha256 -o snapshot2.db
hash compare snapshot1.db snapshot2.db -o changes.txt

# Find duplicates
hash scan -d /media -a sha256 -o media.db
hash compare media.db media.db   # Compare with itself

# Forensic analysis
hash scan -d /evidence -a sha3-256 -o evidence.db
hash scan -d /evidence -a sha256 -o evidence.txt --format hashdeep

# Quick checksums
hash large-backup.tar.gz -f -a blake3
hash scan -d /backups -a blake3 -o checksums.db -p -f

# Automation
hash verify -b hashes.db -d /data --json | jq '.report.mismatches'
hash compare db1.db db2.db --format json | jq '.summary'

Recommended:

  • SHA-256: Widely supported, good security
  • BLAKE3: Fastest cryptographic hash
  • SHA3-256: Post-quantum resistant

Deprecated:

  • MD5, SHA-1: Use only for compatibility

Non-crypto (trusted environments):

  • xxHash3/128: Maximum speed

Automatic support for SSE, AVX, AVX2, AVX-512 (x86_64) and NEON (ARM).

Verify: cargo test --release --test simd_verification -- --nocapture

See SIMD_OPTIMIZATION.md for details.
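Runtime dispatch of this kind typically relies on the standard library's feature-detection macros; the sketch below only reports which path would be chosen and is an assumption about the mechanism (SIMD_OPTIMIZATION.md describes what the project actually does):

// Sketch: pick the widest SIMD path the CPU supports at runtime.
fn pick_impl() -> &'static str {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx512f") {
            return "avx512";
        } else if is_x86_feature_detected!("avx2") {
            return "avx2";
        } else if is_x86_feature_detected!("sse4.1") {
            return "sse";
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            return "neon";
        }
    }
    "scalar"
}

fn main() {
    println!("selected SIMD path: {}", pick_impl());
}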

Supported patterns:

  • * - Matches any number of characters (e.g., *.txt, file*)
  • ? - Matches exactly one character (e.g., file?.bin)
  • [...] - Matches any character in brackets (e.g., [abc]*.jpg)

Examples:

hash "*.txt" -a sha256 # All .txt files in current dir hash "data/*.bin" -a sha256 # All .bin files in data/ hash "file?.txt" -a sha256 # file1.txt, fileA.txt, etc. hash "[abc]*.jpg" -a sha256 # Files starting with a, b, or c hash scan -d "backup/*/data" -a sha256 -o db.txt # Multiple directories hash verify -b "*.db" -d "data/*" # All .db files against all data dirs

Notes:

  • Patterns are expanded by the shell or the application (application-side expansion is sketched after this list)
  • If no files match, an error is displayed
  • Multiple matches are processed in sorted order
  • For scan/verify with multiple directories, results are aggregated
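Application-side expansion can be done with the glob crate, as sketched below (the helper name expand_pattern is hypothetical; the real logic may differ). Matches are sorted so processing order is deterministic:

use glob::glob;

// Sketch: expand a wildcard pattern into a sorted list of matching paths,
// reporting an error when nothing matches.
fn expand_pattern(pattern: &str) -> Result<Vec<std::path::PathBuf>, String> {
    let mut paths: Vec<_> = glob(pattern)
        .map_err(|e| format!("invalid pattern '{pattern}': {e}"))?
        .filter_map(Result::ok)
        .collect();

    if paths.is_empty() {
        return Err(format!("no files match pattern '{pattern}'"));
    }
    paths.sort();
    Ok(paths)
}

fn main() {
    match expand_pattern("*.txt") {
        Ok(paths) => paths.iter().for_each(|p| println!("{}", p.display())),
        Err(e) => eprintln!("{e}"),
    }
}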
Troubleshooting:

Issue                          Solution
Unsupported algorithm          Run hash list to see available algorithms
Permission errors              Use sudo hash scan -d /protected/dir ...
Slow performance               Use -p for parallel, -f for fast mode, or BLAKE3
Fast mode not working          Fast mode only works with files (not stdin/text)
.hashignore not working        Check file location: /path/to/dir/.hashignore
Wildcard pattern not matching  Ensure pattern is quoted (e.g., "*.txt" not *.txt)
No files match pattern         Check pattern syntax and file locations