A high-performance, ultra-compact binary format for UK postcode geolocation data. Compresses 1.79M postcodes from ~50MB CSV to just 6.2MB binary (88% compression) with 920K+ lookups/second and O(1) exact lookups.
import * as fs from "fs";
import { PostcodeClient } from "./src/PostcodeClient";

// Load the binary database
const buffer = fs.readFileSync("postcodes.pcod");
const client = new PostcodeClient(buffer);

// Exact postcode lookup (O(1))
const result = client.lookup("ST6 5AA");
if (result) {
  console.log(`${result.postcode}: ${result.lat}, ${result.lon}`);
}

// Enumerate all postcodes in an outward (prefix query)
const postcodes = client.enumerateOutward("ST6");
console.log(`Found ${postcodes.length} postcodes in ST6`);

// Get database statistics
const stats = client.getStats();
console.log(`Database: ${stats.totalPostcodes} postcodes, ${stats.fileSize} bytes`);

// Check if postcode exists
const isValid = client.isValidPostcode("M1 1AA");

// Find outwards by prefix
const manchesterOutwards = client.findNearbyOutwards("M");
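One way an exact lookup can run in constant time is to split the postcode into its outward and inward parts and turn the inward part into a dense integer that indexes a per-outward table. The sketch below illustrates that idea only. `splitPostcode` and `inwardIndex` are hypothetical helpers, not part of `PostcodeClient`, and the actual .pcod layout may differ.

```typescript
// Hypothetical helpers, not part of PostcodeClient.
interface SplitPostcode {
  outward: string; // e.g. "ST6"
  inward: string; // e.g. "5AA"
}

// Normalises case and whitespace, then splits a standard-format postcode.
// Special codes that do not follow the usual pattern are not handled here.
function splitPostcode(raw: string): SplitPostcode | null {
  const compact = raw.toUpperCase().replace(/\s+/g, "");
  // The inward part is always one digit followed by two letters.
  const match = /^([A-Z]{1,2}[0-9][A-Z0-9]?)([0-9][A-Z]{2})$/.exec(compact);
  if (!match) return null;
  return { outward: match[1], inward: match[2] };
}

// Maps an inward code to an integer in [0, 6760): 10 digits x 26 x 26 letters.
function inwardIndex(inward: string): number {
  const digit = inward.charCodeAt(0) - 48; // "0" has char code 48
  const first = inward.charCodeAt(1) - 65; // "A" has char code 65
  const second = inward.charCodeAt(2) - 65;
  return (digit * 26 + first) * 26 + second;
}

const parts = splitPostcode("st6 5aa");
if (parts) {
  console.log(parts.outward, parts.inward, inwardIndex(parts.inward));
}
```

With an index like this, finding a postcode within its outward needs no search: the integer can address a bitmap or array slot directly.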
This project uses the ONS Postcode Database, which can be downloaded from https://geoportal.statistics.gov.uk/datasets/e14b1475ecf74b58804cf667b6740706. Once you have downloaded the dataset, put the full ONS_{month}_{year}_UK.csv file in the data/ directory and run the following command
to extract just the valid postcodes and their lat and long values into a file called postcodes.csv. This file is what you then use to build the binary database.
Ensure that you have a file called postcodes.csv in the root directory. It should contain only 3 columns (the postcode, lat and long values).
Execute the yarn run build command. You should see output like { totalOutwards: 2943, totalPostcodes: 1790884, fileSize: 6509848 } and a new file called postcodes.pcod appear in the root directory.
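Before building, it can help to sanity-check postcodes.csv. The sketch below parses a single row and rejects anything malformed. It assumes the columns appear in the order postcode, lat, long and that a header row may be present. `parseRow` is a hypothetical helper, not part of this project's build script, and the coordinates in the example are illustrative values.

```typescript
interface PostcodeRow {
  postcode: string;
  lat: number;
  lon: number;
}

// Parses one line of the assumed "postcode,lat,long" layout.
// Returns null for header rows, blank fields, or out-of-range coordinates.
function parseRow(line: string): PostcodeRow | null {
  const parts = line.trim().split(",");
  if (parts.length !== 3) return null;

  const postcode = parts[0].trim().toUpperCase();
  const latText = parts[1].trim();
  const lonText = parts[2].trim();
  // Number("") is 0, so empty fields must be rejected explicitly.
  if (postcode === "" || latText === "" || lonText === "") return null;

  const lat = Number(latText);
  const lon = Number(lonText);
  if (!Number.isFinite(lat) || !Number.isFinite(lon)) return null;
  if (Math.abs(lat) > 90 || Math.abs(lon) > 180) return null;

  return { postcode, lat, lon };
}

console.log(parseRow("ST6 5AA,53.05,-2.19")); // illustrative coordinates
console.log(parseRow("postcode,lat,long")); // header row is rejected
```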
Performance & Compression
| Metric | Value | Details |
| --- | --- | --- |
| Input Size | ~50MB | Raw CSV with 1.79M postcodes |
| Output Size | 6.2MB | Binary format (.pcod file) |
| Compression Ratio | 88% | 8:1 compression ratio |
| Entropy Efficiency | 7.8/8.0 | Near-optimal bit utilization |
| Coordinate Precision | 1e-4° | ~11m resolution at UK latitudes |
| Average Error | 0.35m | Quantization error vs source |

| Operation | Performance | Details |
| --- | --- | --- |
| Exact Lookup | 920K+ ops/sec | O(1) constant time |
| Latency | <1ms | Per lookup on M3 Mac |
| Memory Usage | <10MB | Total runtime footprint |
| Prefix Query | Variable | Depends on outward size |
| Build Time | 1.65s | For 1.79M postcodes |
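The size figures can be cross-checked with a little arithmetic. The sketch below uses the `totalPostcodes` and `fileSize` values from the sample build output in this README. The CSV size is only given as "~50MB", so treating it as 50 MiB is an assumption.

```typescript
const totalPostcodes = 1790884; // from the sample build output
const fileSizeBytes = 6509848; // from the sample build output
const csvSizeBytes = 50 * 1024 * 1024; // assumption: "~50MB" taken as 50 MiB

const fileSizeMiB = fileSizeBytes / (1024 * 1024);
const bytesPerPostcode = fileSizeBytes / totalPostcodes;
const bitsPerPostcode = bytesPerPostcode * 8;
const sizeReduction = 1 - fileSizeBytes / csvSizeBytes;
const ratio = csvSizeBytes / fileSizeBytes;

console.log(`File size: ${fileSizeMiB.toFixed(2)} MiB`);
console.log(`Per postcode: ${bytesPerPostcode.toFixed(2)} bytes (${bitsPerPostcode.toFixed(1)} bits)`);
console.log(`Size reduction: ${(sizeReduction * 100).toFixed(1)}% (${ratio.toFixed(1)}:1)`);
```

Under these assumptions the file works out to roughly 3.6 bytes (about 29 bits) per postcode, including all index data, and the reduction comes to about 88%, or 8:1.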
Below is the output from yarn ts-node scripts/build-and-test.ts on a Mac Pro M3, which builds the database and validates every postcode lookup:
🏗️ Building postcode database...
✅ Build completed in 1.65s
🧪 Loading database and starting tests...
🔍 Testing postcode lookups...
============================================================
POSTCODE DATABASE TEST REPORT
============================================================
📊 PROCESSING STATISTICS:
Total rows processed: 1,790,884
Valid rows: 1,790,884
Invalid rows: 0
Invalid rate: 0.00%
🔍 LOOKUP STATISTICS:
Successful lookups: 1,790,884
Failed lookups: 0
Lookup success rate: 100.00%
📍 COORDINATE ACCURACY:
Coordinate mismatches: 0
Accuracy rate: 100.00%
Max coordinate error: 0.66m
Average coordinate error: 0.35m
⏱️ PERFORMANCE:
Build time: 1.65s
Test time: 1.95s
Total time: 3.59s
Lookup rate: 920,763 lookups/sec
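A coordinate error in metres, as in the report above, can be measured as the great-circle distance between the source coordinate and the coordinate read back from the database. The sketch below shows one way to do that with the haversine formula. Whether the test script uses this exact metric is an assumption, and `quantize`, the step size and the sample coordinate are illustrative only.

```typescript
const EARTH_RADIUS_M = 6371000; // mean Earth radius in metres

function toRadians(degrees: number): number {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance between two points, in metres (haversine formula).
function haversineMetres(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Rounds a value to the nearest multiple of `step` (hypothetical helper).
function quantize(value: number, step: number): number {
  return Math.round(value / step) * step;
}

// Illustrative source coordinate and step size.
const source = { lat: 53.123456, lon: -2.123456 };
const step = 1e-4;
const stored = { lat: quantize(source.lat, step), lon: quantize(source.lon, step) };
const errorMetres = haversineMetres(source.lat, source.lon, stored.lat, stored.lon);
console.log(`Quantization error: ${errorMetres.toFixed(2)} m`);
```

Rounding to a grid of step s bounds the error on each axis by s/2, so the step size directly determines the worst-case distance.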
The binary format achieves exceptional compression through:
Bit-packed coordinate deltas: Variable-width encoding based on sector ranges
Adaptive storage modes: Bitmap vs list encoding based on density
Quantized coordinates: 1e-4° precision (sufficient for postcode accuracy)
Entropy optimization: 7.8/8.0 bits/byte efficiency (97.5% of theoretical maximum)
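To make the bit-packing idea concrete, the sketch below stores a group of integer coordinates as a base value plus fixed-width deltas, with the width chosen from the range of the group. This is a minimal illustration of the general technique, not the actual .pcod encoding. `BitWriter`, `BitReader`, `packDeltas` and `unpackDeltas` are hypothetical names, and the sample values are illustrative.

```typescript
// Number of bits needed to store any value in [0, maxValue].
function bitsFor(maxValue: number): number {
  return Math.max(1, 32 - Math.clz32(maxValue));
}

class BitWriter {
  private bytes: number[] = [];
  private bitCount = 0;

  // Appends the low `width` bits of `value`, most significant bit first.
  write(value: number, width: number): void {
    for (let i = width - 1; i >= 0; i--) {
      const bit = (value >>> i) & 1;
      const byteIndex = this.bitCount >> 3;
      if (byteIndex === this.bytes.length) this.bytes.push(0);
      this.bytes[byteIndex] |= bit << (7 - (this.bitCount & 7));
      this.bitCount++;
    }
  }

  toBytes(): Uint8Array {
    return Uint8Array.from(this.bytes);
  }
}

class BitReader {
  private bytes: Uint8Array;
  private bitPos = 0;

  constructor(bytes: Uint8Array) {
    this.bytes = bytes;
  }

  // Reads `width` bits (at most 31) as an unsigned integer.
  read(width: number): number {
    let value = 0;
    for (let i = 0; i < width; i++) {
      const bit = (this.bytes[this.bitPos >> 3] >> (7 - (this.bitPos & 7))) & 1;
      value = (value << 1) | bit;
      this.bitPos++;
    }
    return value;
  }
}

interface PackedBlock {
  base: number; // smallest value in the group
  width: number; // bits per delta
  count: number;
  data: Uint8Array;
}

function packDeltas(values: number[]): PackedBlock {
  if (values.length === 0) throw new Error("packDeltas needs at least one value");
  const base = Math.min(...values);
  const width = bitsFor(Math.max(...values) - base);
  const writer = new BitWriter();
  for (const v of values) writer.write(v - base, width);
  return { base, width, count: values.length, data: writer.toBytes() };
}

function unpackDeltas(block: PackedBlock): number[] {
  const reader = new BitReader(block.data);
  const out: number[] = [];
  for (let i = 0; i < block.count; i++) out.push(block.base + reader.read(block.width));
  return out;
}

// Illustrative latitudes in units of 1e-4 degrees for one small group.
const block = packDeltas([530512, 530498, 530530, 530505]);
console.log(block.width, block.data.length, unpackDeltas(block));
```

Because nearby postcodes have nearby coordinates, the deltas within a group span a small range and need far fewer bits than a full coordinate would.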
Entropy Distribution:
95.8% of data chunks achieve >7.5 bits/byte entropy
Coordinate data sections: 7.8 bits/byte (highly compressed)
Index sections: 6.3 bits/byte (structured but efficient)
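Bits-per-byte figures like these are typically Shannon entropy computed over the byte-value histogram of a chunk. The sketch below shows that calculation. How the figures above were actually measured (chunk size, tooling) is not stated, so treat `byteEntropy` as a hypothetical helper for reproducing a comparable number.

```typescript
// Shannon entropy of a byte buffer, in bits per byte (0 to 8).
function byteEntropy(data: Uint8Array): number {
  if (data.length === 0) return 0;

  // Histogram of byte values.
  const counts = new Array<number>(256).fill(0);
  for (let i = 0; i < data.length; i++) counts[data[i]]++;

  let entropy = 0;
  for (let v = 0; v < 256; v++) {
    if (counts[v] === 0) continue;
    const p = counts[v] / data.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// A buffer of identical bytes carries no information per byte;
// a buffer with every byte value equally frequent reaches the 8-bit maximum.
const constant = new Uint8Array(1024).fill(7);
const uniform = Uint8Array.from({ length: 256 }, (_, i) => i);
console.log(byteEntropy(constant), byteEntropy(uniform));
```

To apply it to the database, the same function could be run over slices of the buffer returned by `fs.readFileSync("postcodes.pcod")`.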