Note: AI translation tools were used to assist with language clarity, as our research team are not native English speakers. We believe the quality of our research and findings will speak for itself.
Pkgforge hosts the world's largest collection of prebuilt, static binaries that work everywhere without dependencies. While our main repos contain packages that are hand-picked and manually maintained by our team, we had an ambitious idea: what if we could automatically harvest CLI tools from ecosystems like Rust's crates.io, build them as static binaries, and make them available to everyone?
Instead of manually curating every package, we decided to tap into existing package ecosystems and automate the entire process. After two weeks of intensive development and countless iterations, we made this idea a reality.
Crates.io provides API access for individual crate lookups and bulk operations. Initially, our script iterated through the first 1,000 pages (sorted by downloads) with 100 crates per page, yielding approximately 111,000 crates. However, we soon encountered a significant bottleneck: we needed to query each crate individually to determine whether it belonged to the command-line-utilities category or produced executables, i.e. contained [[bin]] in its manifest.
This approach proved impractical: we quickly hit rate limits and risked violating the Usage Policy. Fortunately, RFC-3463 came to our rescue. Crates.io provides periodic database dumps at static.crates.io/db-dump.tar.gz. We quickly drafted a nifty CLI using dtolnay/db-dump.

Then it was just a matter of parsing this with a bit of jq, & automating it via GitHub Actions. Our workflow now generates all the data we will need, automatically.
Since we ended up with over 111,000 crates, we needed to set some constraints & filter for what we actually wanted to build:
Must either be in the command-line-utilities category (categories = ["command-line-utilities"]) or produce executables, i.e. declare [[bin]] in its manifest
Must have been updated within the last year, i.e. after 2024-01-01
We ended up with ~ 10,000 crates that we now planned to compile.
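The constraints above can be sketched as a small filter. This is only an illustration and not the project's actual jq pipeline: the tab-separated input layout (name, categories, has-bin flag, last-updated date), the function name `filter_crates`, and the sample crate names are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical input, one crate per line, tab-separated:
#   name <TAB> comma-separated categories <TAB> has_bin (true/false) <TAB> updated (YYYY-MM-DD)
# Prints the names of crates that are CLI utilities or ship a [[bin]] target
# AND were updated after the cutoff date (default: 2024-01-01).
filter_crates() {
  cutoff="${1:-2024-01-01}"
  awk -F '\t' -v cutoff="$cutoff" '
    {
      is_cli  = (("," $2 ",") ~ /,command-line-utilities,/)
      has_bin = ($3 == "true")
      # Appending "" forces a string comparison; ISO dates sort correctly as text.
      if ((is_cli || has_bin) && ($4 "") > (cutoff "")) print $1
    }'
}

# Example with made-up crates: only the first two pass the filter.
printf 'tool-a\tcommand-line-utilities\tfalse\t2024-05-01\ntool-b\tparsing\ttrue\t2024-03-02\nlib-c\tparsing\tfalse\t2024-03-02\n' \
  | filter_crates
```

Keeping the cutoff as a parameter makes it easy to tighten or relax the "updated within the last year" rule without touching the filter itself.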
To achieve truly portable, optimized, and statically linked binaries, we applied the following comprehensive build constraints:
[+] Flags:

    -C target-feature=+crt-static \
    -C default-linker-libraries=yes \
    -C link-self-contained=yes \
    -C prefer-dynamic=no \
    -C embed-bitcode=yes \
    -C lto=yes \
    -C opt-level=3 \
    -C debuginfo=none \
    -C strip=symbols \
    -C link-arg=-Wl,-S \
    -C link-arg=-Wl,--build-id=none \
    -C link-arg=-Wl,--discard-all \
    -C link-arg=-Wl,--strip-all

Statically Linked: -C target-feature=+crt-static
Self Contained: -C link-self-contained=yes
All Features: --all-features
Link Time Optimization: -C lto=yes
All Optimizations: -C opt-level=3
Stripped: -C debuginfo=none -C strip=symbols
No System Libraries: Crates with system library dependencies will fail by design, as we target pure Rust implementations
    error: could not find system library 'openssl' required by the 'openssl-sys' crate
    error: Could not find system library 'sqlite3'
    error: pkg-config has not been configured to support cross-compilation
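As a rough illustration of how the codegen flags listed above could be handed to the compiler, the sketch below joins them into a single `RUSTFLAGS` value. Passing them through the `RUSTFLAGS` environment variable is an assumption here; the source does not say how the pipeline actually supplies them.

```shell
#!/bin/sh
# Join the codegen flags from the list above into one RUSTFLAGS string.
# Assumption: the flags reach rustc through the RUSTFLAGS environment variable.
RUSTFLAGS=""
for flag in \
  "target-feature=+crt-static" \
  "default-linker-libraries=yes" \
  "link-self-contained=yes" \
  "prefer-dynamic=no" \
  "embed-bitcode=yes" \
  "lto=yes" \
  "opt-level=3" \
  "debuginfo=none" \
  "strip=symbols" \
  "link-arg=-Wl,-S" \
  "link-arg=-Wl,--build-id=none" \
  "link-arg=-Wl,--discard-all" \
  "link-arg=-Wl,--strip-all"
do
  # Prefix each entry with -C and separate entries with a single space.
  RUSTFLAGS="${RUSTFLAGS:+$RUSTFLAGS }-C $flag"
done
export RUSTFLAGS
printf '%s\n' "$RUSTFLAGS"
```

Keeping one flag per line makes it simple to comment out a single option when investigating why a particular crate fails to link.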
With over 10,000 crates to build on individual GitHub Actions runners, speed was paramount. While Cargo offers cross compilation features, it requires significant setup overhead. We needed a solution that worked out of the box.
Our heavy Docker images used for official packages consumed 2-3 minutes just for pulling and extraction, making them unsuitable for this scale. This left us with rust-cross/cargo-zigbuild & cross-rs/cross. After some local testing, we decided to use Cross as it supported all the targets we needed & worked as advertised: “Zero setup” cross compilation.
We also used jpeddicord/askalono to automatically detect & copy over licenses.
    cross +nightly build --target "${RUST_TARGET}" -Z unstable-options \
      --all-features \
      --artifact-dir="${C_ARTIFACT_DIR}" \
      --jobs="$(($(nproc)+1))" \
      --release \
      --verbose

While Soar supports any Unix-based distro, due to the lack of CI support for other Unix kernels on GitHub Runners (natively, not VMs), we are limited to Linux only. We further refined our target matrix by excluding architectures approaching end-of-life:
| HOST_TRIPLET | RUST_TARGET |
| --- | --- |
| aarch64-Linux | aarch64-unknown-linux-musl |
| loongarch64-Linux | loongarch64-unknown-linux-musl |
| riscv64-Linux | riscv64gc-unknown-linux-musl |
| x86_64-Linux | x86_64-unknown-linux-musl |
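The table can be read as a simple lookup from host triplet to Rust target. The sketch below shows one way to express that mapping; the function name `rust_target_for` is hypothetical, and the loop only prints the pairs rather than invoking any build.

```shell
#!/bin/sh
# Map a host triplet from the table above to its Rust target triple.
# rust_target_for is a hypothetical helper, not part of the project's tooling.
rust_target_for() {
  case "$1" in
    aarch64-Linux)     echo "aarch64-unknown-linux-musl" ;;
    loongarch64-Linux) echo "loongarch64-unknown-linux-musl" ;;
    riscv64-Linux)     echo "riscv64gc-unknown-linux-musl" ;;
    x86_64-Linux)      echo "x86_64-unknown-linux-musl" ;;
    *) echo "unsupported host triplet: $1" >&2; return 1 ;;
  esac
}

# Dry run: list the target that each matrix entry would build for.
for host in aarch64-Linux loongarch64-Linux riscv64-Linux x86_64-Linux; do
  printf '%s -> %s\n' "$host" "$(rust_target_for "$host")"
done
```

Rejecting unknown triplets explicitly keeps a typo in the matrix from silently producing a build for the wrong target.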
We are aware of issues like https://github.com/rust-lang/cargo/issues/13897, so we wanted this to be as secure as our official repositories, by ensuring:
Crates are downloaded from crates.io, like the official Cargo does.
CI/CD runs on GitHub Actions, with temporary, scoped tokens per package
Build Logs are viewable using: soar log ${PKG_NAME}
Build source can be downloaded from: {GHCR_PKG}-srcbuild-${BUILD_ID}
Artifact Attestation & Build Provenance are created/updated per build.
Checksums are generated (& verified at install time by Soar) for each & every artifact per build.
These measures ensure that even if a malicious crate attempts to compromise the system, its impact is isolated and cannot affect other crates' integrity.
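To make the checksum step concrete, here is a minimal sketch of verifying a downloaded artifact before use. It assumes SHA-256 via `sha256sum`, since the source does not name the hash algorithm, and `verify_checksum` is a hypothetical helper rather than Soar's actual implementation.

```shell
#!/bin/sh
# Compare a file's checksum with the expected value published for it.
# Assumption: SHA-256 via sha256sum; the algorithm Soar uses is not stated in the text.
verify_checksum() {
  file="$1"
  expected="$2"
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (expected $expected, got $actual)" >&2
    return 1
  fi
}
```

A caller would typically refuse to install the artifact when the function returns a non-zero status, for example `verify_checksum ./some-binary "$expected_sum" || exit 1`, where both names are placeholders.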
10,000 crates multiplied by 4 targets meant we would need to run ~ 40,000 CI instances while also handling metadata, sanity checks, and uploads to GHCR, all at the same time. We also set up a Discord webhook to stream real-time progress updates to our Discord server.


Build Success vs. Failure
We approached this project with optimistic expectations but encountered a sobering reality. Out of approximately 10,000 crates queued for building:
    🏗️ Build Pipeline by Success Rate
    ────────────────────────────────────────────────────────────────────────
    ✅ Queued  ████████████████████████████████████████ 10,033 (100.0%)
    ⚙️ Built   ███████████████████████████ 5,779 (57.60%)
    ❌ Failed  ████████████ 4,254 (42.40%)
    ────────────────────────────────────────────────────────────────────────

So what went wrong? We sampled about 100 of these error logs & concluded:
System Library Dependencies: The majority of failures stemmed from crates requiring system libraries that weren't available in our static build environment
Custom Build Systems: Many crates include build.rs files that fail when specified dependencies aren't met or when detecting system features during cross-compilation
build.rs files that:
- Detect system features
- Link against system libraries
- Generate code based on target environment
Despite years of Rust ecosystem maturation, system library dependencies remain the primary obstacle to universal static compilation. This reinforces our strategy of targeting CLI tools that can be fully statically linked.
Crates vs Executables
Another interesting insight from building at scale: many crates produce multiple executables. The ~ 5,800 crates we built successfully generated ~ 21,000 individual executables (also referred to as binaries or packages).
    🏗️ Build Pipeline by Executables
    ──────────────────────────────────────────────────────────────────────────────────
    📦 Crates Built       ██████████ 5,779 (100.0%)
    ⚙️ Total Executables  ████████████████████████████████████████ 21,042 (364.1%)
    ──────────────────────────────────────────────────────────────────────────────────

This 3.6:1 ratio reveals how rich the Rust CLI ecosystem actually is.
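The percentages in the two charts can be rechecked from the raw counts. The sketch below does this with a small helper; `pct` is a hypothetical name, and the counts are the ones reported above.

```shell
#!/bin/sh
# Recompute the chart percentages from the raw counts reported above.
pct() {
  # pct PART TOTAL -> PART as a percentage of TOTAL, rounded to two decimals
  awk -v part="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * part / total }'
}

queued=10033
built=5779
failed=4254
executables=21042

echo "built:       $(pct "$built" "$queued")%"                       # 57.60%
echo "failed:      $(pct "$failed" "$queued")%"                      # 42.40%
echo "executables: $(pct "$executables" "$built")% of crates built"  # 364.11%
```

Built and failed counts add up to the queued total (5,779 + 4,254 = 10,033), and the executables figure of roughly 364% corresponds to the 3.6:1 ratio quoted in the text.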
Native vs Cross
Note: This counts the executables generated & not individual crates. A single crate may generate multiple executables. (See above)
The consistent success rates across architectures demonstrate Rust's excellent cross-platform story, though newer architectures like loongarch64 show slightly lower compatibility rates. This suggests that architecture-specific code assumptions remain common in the ecosystem.
An interesting anomaly: Despite building 5,779 crates successfully, x86_64-Linux only shows 5,627 executables. This discrepancy occurs because some crates successfully build for non-standard targets like loongarch64-Linux and riscv64-Linux but fail for standard architectures due to build hooks and scripts that trigger differently across targets.


You can explore detailed per-target build results here: CRATES_BUILT.json
CI Performance Metrics
Our primary build workflow (matrix_builds.yaml) handles the bulk of compilation, with additional workflows managing metadata and miscellaneous tasks. As we implement incremental builds (only rebuilding updated crates) and caching strategies, these metrics will improve significantly.

Average build time was ~ 2 minutes.

Compilation vs. Prebuilt Distribution
Compilation will always be slower than fetching prebuilt binaries, but the degree varies significantly based on crate complexity and dependency count. For our demonstration, we'll use fd-find as a representative example, though your experience may vary with more dependency-heavy crates.
Note: We're not measuring CPU, disk, memory, or bandwidth usage here—try it yourself to experience the full performance difference.
Cargo

Cargo Binstall/Quick Install
Cargo Binstall leverages prebuilt binaries, though it requires time for crate resolution: related issue

Cargo-Binstall and Cargo-Quickinstall are excellent tools that:
Integrate with cargo install workflow
Handle development dependencies and features
Target developers who want faster cargo install
Soar takes a different approach:
Distribution-focused: Static executables for end users
No development integration: Not meant for cargo workflows
Dependency-free: Zero system library requirements
Cross-distribution: Works on any *nix system (MUSL/GLIBC)
Soar

This project represents more than just a build farm; it's a proof of concept & also a reality check for the whole ecosystem.
Key Discoveries and Implications
The Rust CLI ecosystem is remarkably rich and diverse. Our 3.6:1 ratio of executables to crates shows that many crates ship whole tool suites rather than single-purpose utilities. Because of this multiplier, successfully building even a subset of the available crates delivers several times as many executables to end users.
Cross-compilation compatibility has room for improvement. While Rust's cross-platform story is generally excellent, our 42.4% failure rate highlights that system library dependencies and architecture-specific assumptions remain significant obstacles. This suggests opportunities for the community to develop more portable alternatives to system library bindings.
Static linking is both powerful and challenging. The ability to produce truly portable binaries that work across any Linux distribution without dependencies is transformative for CLI tool distribution. However, achieving this requires careful consideration of build flags, dependencies, and compilation strategies.
Broader Ecosystem Implications
Our work demonstrates that automated, large-scale binary distribution is not only feasible but can provide significant value to the developer community. The time savings alone—from nearly a minute of compilation time to under two seconds of download time—represent a meaningful improvement in developer productivity.
More importantly, this approach democratizes access to CLI tools. Users no longer need to have Rust installed, understand compilation flags, or debug dependency issues. They can simply install and use tools, lowering the barrier to entry for adopting Rust-based CLI utilities.
The pkgforge-cargo project will likely see these additions/improvements in the near future:
Automated updates: Rebuild crates when new versions are published (this is partially implemented)
Integration with Cargo: Maybe something similar to what `cargo binstall` does.
Build optimization: Optimize CI Build times & reduce Failures
Contribute Upstream: Opt-in system to automatically create GitHub issues with build logs when crate compilation fails, helping maintainers improve cross-compilation compatibility
Community Feedback: Listen to our users & the community to improve this project & hope for a widespread adoption beyond Soar.
As we continue to refine and expand this system, we're excited about its potential to influence how the broader software community thinks about binary distribution. The lessons learned here apply beyond Rust to any compiled language ecosystem, and we're eager to explore applications in Go, Zig, and other emerging systems languages. (Help us if you can)
The ultimate goal is to create a world where installing and using CLI tools is as simple as possible, regardless of the underlying programming language or system dependencies. This project represents a significant step toward that vision, and we're committed to continued innovation in this space.
We invite the community to engage with this work, contribute improvements, and help us build a more accessible and efficient software distribution ecosystem. Together, we can make powerful CLI tools available to everyone, everywhere, without the traditional barriers of compilation and dependency management.
Links:
Pkgforge-Cargo: https://github.com/pkgforge-cargo/builder
Pkgforge-Discord: https://discord.gg/djJUs48Zbu


