The Cost of Software Libraries

Dependency Hell

So the other day, I had this argument on X that inspired me to write about the cost of using libraries.

By “libraries” I mean that thing most of us are addicted to using these days, where you go to your package manager, type npm install {something}, and {something} magically pops into your project, giving you “incredible” new functionality for basically zero cost.

But is it really zero cost?

Recently I shared a screenshot of my command-line API on X in response to a post criticizing an OOP pattern that created an entire class hierarchy just to parse the command-line.

Then this random guy re-shared the post with his comments:

It seemed to me like an open-minded comment from someone clearly ready to have an intellectual discussion, so I decided to engage.

And he did! He sent this:

He followed up with the affirmation that “nothing else can beat it”. Strong statements; he must really know what he’s talking about!

I’ll spare you the rest of the posts, as no discussion actually took place, but it made me think about how a reasonable discussion about the pros and cons of using libraries would actually go.

To start, the HSVSphere guy kinda has a point. It is nice to just define macros on top of a struct and get type-safe arguments. I’d argue it’s “clean” as well, in the sense that you can look at this struct and see every argument your program supports.

#[derive(Parser, Clone)]
enum Command {
    Build,
}

#[derive(Parser, Clone)]
struct Args {
    #[clap(subcommand)]
    command: Option<Command>,
    #[clap(long)]
    debug: bool,
    #[clap(long)]
    release: bool,
    #[clap(long)]
    output: Option<String>,
}

It’s definitely also nice to get a library that “just works” in exchange for a few seconds typing cargo add clap to install it.

Seems like a no-brainer, so why did I roll my own command-line parsing code instead?

Well, because command-line parsing is trivial, and the costs of using a library to solve this problem far outweigh the benefits.
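To make “trivial” concrete, the core of a hand-rolled parser is little more than a loop of string comparisons over argv. Here’s a bare-bones sketch for illustration (this is not the cmd_line.c API shown later):

#include <stdio.h>
#include <string.h>

// Bare-bones sketch: walk argv once and match known commands, flags, and
// options with strcmp. Purely illustrative, not the engine's cmd_line.c.
int main(int argc, char *argv[]) {
    int build = 0, debug = 0, release = 0;
    const char *output = NULL;

    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "build") == 0) {
            build = 1;
        } else if (strcmp(argv[i], "--debug") == 0) {
            debug = 1;
        } else if (strcmp(argv[i], "--release") == 0) {
            release = 1;
        } else if (strcmp(argv[i], "--output") == 0 && i + 1 < argc) {
            output = argv[++i];
        } else {
            fprintf(stderr, "Unknown argument: %s\n", argv[i]);
            return 1;
        }
    }

    if (build) {
        printf("build%s%s, output=%s\n",
               release ? " (release)" : "",
               debug ? " (debug)" : "",
               output ? output : "<none>");
    }
    return 0;
}

The cmd_line.c version shown later adds registration of commands and flags plus an arena for storage, but the essence stays about this small.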

What are the costs you ask?

I’ll go over them below. But first, it’s worth pointing out what an engineer is responsible for when writing software. You might think it’s a long list, but it’s not.

A software engineer is responsible for the compiled executable, running on the target hardware.

That’s it.

Notice that there is nothing here about code. Code is the means through which we generate the executable—the actual useful thing that runs on the actual hardware doing actual work. You could be writing C, Rust, Python, or x86 Assembly; it doesn’t matter.

Therefore, an engineer worth their salary should be thinking not just about how to save time when writing a piece of code, but also about the hardware cost of running this code (energy spent, server cost, bandwidth cost, etc.) and its impact on the end user (runtime speed, stability, memory usage, etc.).

With this in mind, here are the costs of using clap for parsing command-line arguments.

This is a Hello World program in C and its executable size:

#include <stdio.h>

int main() {
    printf("Hello, World!\n");
    return 0;
}

time clang -O3 main.c -o main && ./main build --debug --output o.exe && ls -l main
clang -O3 main.c -o main  0.08s user 0.04s system 122% cpu 0.100 total
Hello, World!
-rwxr-xr-x@ 1 gabrieldechichi  staff  33432 Oct 20 20:06 main

And this is the equivalent Rust program and its executable size:

fn main() {
    println!("Hello, world!");
}

cargo run --release && ls -l target/release/cmdline_example
   Compiling cmdline_example v0.1.0
    Finished `release` profile [optimized] target(s) in 0.18s
     Running `target/release/cmdline_example`
Hello, world!
-rwxr-xr-x@ 1 gabrieldechichi  staff  468656 Oct 20 19:29 target/release/cmdline_example

Without going into why the Rust build is somehow 469 KB, the question is: how much of an increase in executable size is parsing the command line worth?

1 KB? 10 KB? 1000 KB? 1,000,000 KB?

This might sound like a stupid question at first, but there is a limit. If I told you my amazing command-line library was 1 GB in size, you would probably look elsewhere.

Let’s start with the C example. I’ve copy-pasted the cmd_line.c and memory.c implementations from my engine, with some small modifications. The files are attached for anyone interested in looking at the code behind the API.

#include <stdio.h>
#include "memory.c"
#include "cmd_line.c"

int main(int argc, char *argv[]) {
    size_t buffer_size = 64 * 1024;
    u8 *buffer = calloc(1, buffer_size);
    ArenaAllocator arena = arena_from_buffer(buffer, buffer_size);

    CmdLineParser parser = cmdline_create(&arena);

    // Register cmds
    cmdline_add_command(&parser, "build");
    cmdline_add_flag(&parser, "debug");
    cmdline_add_flag(&parser, "release");
    cmdline_add_option(&parser, "output");

    if (!cmdline_parse(&parser, argc, argv)) {
        return 1;
    }

    if (cmdline_has_command(&parser, "build")) {
        printf("Building project");
        if (cmdline_has_flag(&parser, "release")) {
            printf(" in release mode");
        } else if (cmdline_has_flag(&parser, "debug")) {
            printf(" in debug mode");
        }
        printf("...\n");

        const char *out = cmdline_get_option(&parser, "output");
        if (out) {
            printf("Output will be written to: %s\n", out);
        }
    }

    return 0;
}

time clang -O3 main_cmdline.c -o main && ./main build --debug --output o.exe && ls -l main
clang -O3 main_cmdline.c -o main  0.08s user 0.04s system 105% cpu 0.123 total
Building project in debug mode...
Output will be written to: o.exe
-rwxr-xr-x@ 1 gabrieldechichi  staff  35224 Oct 20 20:06 main

As you can see, my cmd_line.c implementation added 1.79 KB to the release executable. Honestly, this is more than I expected, but I guess it makes sense given I’m also adding the memory.c implementation and the standard library functions it depends on into the mix. (I really want to get rid of the standard library someday.)

What about clap? Glad you asked!

use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command()]
struct Cli {
    #[command(subcommand)]
    command: Option<Commands>,
}

#[derive(Subcommand)]
enum Commands {
    Build {
        #[arg(long)]
        debug: bool,
        #[arg(long)]
        release: bool,
        #[arg(long)]
        output: Option<String>,
    },
}

fn main() {
    let cli = Cli::parse();

    match &cli.command {
        Some(Commands::Build { debug, release, output }) => {
            print!("Building project");
            if *release {
                print!(" in release mode");
            } else if *debug {
                print!(" in debug mode");
            }
            println!("...");

            if let Some(out) = output {
                println!("Output will be written to: {}", out);
            }
        }
        None => {
            // No command provided
        }
    }
}

cargo run --release -- build --debug --output o.c && ls -l target/release/cmdline_example
   Compiling proc-macro2 v1.0.101
   Compiling unicode-ident v1.0.19
   Compiling quote v1.0.41
   Compiling utf8parse v0.2.2
   Compiling colorchoice v1.0.4
   Compiling anstyle v1.0.13
   Compiling anstyle-query v1.1.4
   Compiling is_terminal_polyfill v1.70.1
   Compiling clap_lex v0.7.6
   Compiling heck v0.5.0
   Compiling anstyle-parse v0.2.7
   Compiling strsim v0.11.1
   Compiling anstream v0.6.21
   Compiling clap_builder v4.5.50
   Compiling syn v2.0.107
   Compiling clap_derive v4.5.49
   Compiling clap v4.5.50
   Compiling cmdline_example v0.1.0 (/Users/gabrieldechichi/dev/src/github.com/gabrieldechichi/cgamedev-articles/251020-library-costs/rust)
    Finished `release` profile [optimized] target(s) in 3.22s
     Running `target/release/cmdline_example build --debug --output o.c`
Building project in debug mode...
Output will be written to: o.c
-rwxr-xr-x@ 1 gabrieldechichi  staff  995296 Oct 20 19:51 target/release/cmdline_example

A *whopping* 527 KB added to the executable—just to parse a couple of strings. We barely have a program yet, and we’re almost at 1 MB for the optimized build.

To give you a sense of how wasteful this is, the ROM size for Super Mario 64 is between 6 and 8 MB depending on the region. That’s for a complete 3D game, including assets, 18 levels, and at least 20 hours of gameplay. Mario reasons about the costs!

You might be thinking I’m micro-optimizing here. What is 500 KB at current download speeds?

Although this might be true (and 500 KB is still significant for many applications), in practice it’s never just one library, right? It’s usually dozens, each with its own dependencies, adding unnecessary bloat to the application. A few KB here and there, and soon your ChatGPT clone turns into a 250 MB executable.

If you think I’m making this up, just google “trending repositories javascript” and pick the first one on the list. On the day I’m writing this article it’s koodo-reader. Take a look at its package.json:

(...)
"dependencies": {
  "@aws-sdk/client-s3": "^3.485.0",
  "adm-zip": "^0.5.2",
  "axios": "^0.19.2",
  "basic-ftp": "^5.0.5",
  "better-sqlite3": "^11.6.0",
  "buffer": "^6.0.3",
  "chardet": "^2.0.0",
  "copy-text-to-clipboard": "^2.2.0",
  "dompurify": "^3.2.4",
  "electron-is-dev": "^1.1.0",
  "electron-store": "^8.0.1",
  "fflate": "^0.8.2",
  "file-saver": "^2.0.5",
  "form-data": "^4.0.2",
  "fs-extra": "^9.1.0",
  "hammerjs": "^2.0.8",
  "howler": "^2.2.3",
  "js-untar": "^2.0.0",
  "jszip": "^3.10.1",
  "localforage": "^1.10.0",
  "mammoth": "^1.8.0",
  "marked": "^15.0.11",
  "megajs": "1.3.9-next.17",
  "mhtml2html": "^3.0.0",
  "node-machine-id": "^1.1.12",
  "qs": "^6.11.2",
  "rangy": "1.3.0",
  "react-hot-toast": "^2.1.1",
  "react-sortablejs": "^6.1.4",
  "react-tooltip": "^5.28.0",
  "sortablejs": "^1.15.6",
  "sse.js": "^2.6.0",
  "ssh2-sftp-client": "^11.0.0",
  "underscore": "^1.13.7",
  "uuid": "^11.0.5",
  "webdav": "^5.7.1"
},
"devDependencies": {
  ...

35 dependencies, 41 dev dependencies. Its yarn.lock file alone is 15 thousand lines. This is insane.

A common misconception is the idea that

“if there is a library for it, it must be good”

As if every library writer were a rockstar programmer who put everything they had into that library.

In reality, the opposite is true. Most libraries, especially open-source libraries, are written by the average experienced programmer, with no concern for much of anything other than getting features in with the desired abstraction.

And even in the rare case where the library was written by a great programmer with careful attention to detail and performance, it’s most likely the case that the library writer had to support a wide range of use cases, which prevents them from making reasonable assumptions about the runtime—inevitably impacting the library’s performance and increasing its surface area for bugs.

The classic example I always use is malloc. When I say I write my own allocators, the inevitable reaction I get from most people is: “do you think you can write a better malloc than the standard library?” The answer is no, but also, I don’t have to.

I write narrow-purpose allocators that work well for my codebase and the problems a game engine needs to solve, then I write code that is aware of these assumptions. The result is runtime performance many times faster than malloc, while still being flexible enough for what my software needs.
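To give a sense of what I mean by “narrow purpose”, here is a minimal arena (bump) allocator sketch. It is purely illustrative; the names and layout are not the engine’s actual memory.c:

#include <stdint.h>
#include <stddef.h>

// Minimal arena (bump) allocator sketch, not the engine's memory.c:
// one contiguous buffer, a cursor that only moves forward, and a reset
// that frees everything at once.
typedef struct {
    uint8_t *base;
    size_t   size;
    size_t   used;
} Arena;

static Arena arena_init(uint8_t *buffer, size_t size) {
    Arena a = { buffer, size, 0 };
    return a;
}

static void *arena_alloc(Arena *a, size_t bytes) {
    // Round the request up to 16 bytes, assuming the buffer itself is
    // 16-byte aligned, so every returned pointer stays aligned.
    size_t aligned = (bytes + 15) & ~(size_t)15;
    if (a->used + aligned > a->size) return NULL;  // out of space, no fallback
    void *ptr = a->base + a->used;
    a->used += aligned;
    return ptr;
}

static void arena_reset(Arena *a) {
    a->used = 0;  // "free" everything in O(1); no per-allocation bookkeeping
}

Because every allocation is just a cursor bump and a reset is a single store, code written against these assumptions never pays for malloc’s general-purpose bookkeeping (free lists, size classes, thread safety, and so on).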

Let’s verify this in practice, again with the clap example. I’ve reworked the code to run 10,000 command-line parsing iterations.

Here’s the C version

#include <stdio.h>
#include <time.h>
#include "memory.c"
#include "cmd_line.c"

#define ITERATIONS 10000

int main() {
    size_t buffer_size = 64 * 1024;
    u8 *buffer = calloc(1, buffer_size);
    ArenaAllocator arena = arena_from_buffer(buffer, buffer_size);

    char *argv[] = {"cmdline_example", "build", "--release", "--output", "./bin"};
    int argc = 5;

    printf("Running %d iterations of command line parsing...\n", ITERATIONS);

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (int i = 0; i < ITERATIONS; i++) {
        arena_reset(&arena);
        CmdLineParser parser = cmdline_create(&arena);

        cmdline_add_command(&parser, "build");
        cmdline_add_flag(&parser, "debug");
        cmdline_add_flag(&parser, "release");
        cmdline_add_option(&parser, "output");

        if (!cmdline_parse(&parser, argc, argv)) {
            printf("Parse failed on iteration %d\n", i);
            return 1;
        }

        // Mark variables as volatile to prevent optimization
        volatile b32 has_build = cmdline_has_command(&parser, "build");
        volatile b32 has_release = cmdline_has_flag(&parser, "release");
        volatile const char *output = cmdline_get_option(&parser, "output");
    }

    clock_gettime(CLOCK_MONOTONIC, &end);

    long seconds = end.tv_sec - start.tv_sec;
    long nanoseconds = end.tv_nsec - start.tv_nsec;
    double elapsed_ms = seconds * 1000.0 + nanoseconds / 1000000.0;
    double elapsed_us = elapsed_ms * 1000.0;

    printf("Total time: %.6f ms\n", elapsed_ms);
    printf("Average time per iteration: %.3f µs\n", elapsed_us / ITERATIONS);

    return 0;
}

And the Rust version

use clap::{Parser, Subcommand};
use std::time::Instant;

#[derive(Parser)]
#[command()]
struct Cli {
    #[command(subcommand)]
    command: Option<Commands>,
}

#[derive(Subcommand)]
enum Commands {
    Build {
        #[arg(long)]
        debug: bool,
        #[arg(long)]
        release: bool,
        #[arg(long)]
        output: Option<String>,
    },
}

fn main() {
    let args = vec!["cmdline_example", "build", "--release", "--output", "./bin"];
    const ITERATIONS: usize = 10000;

    println!("Running {} iterations of command line parsing...", ITERATIONS);

    let start = Instant::now();

    for _ in 0..ITERATIONS {
        let cli = Cli::parse_from(&args);

        match &cli.command {
            Some(Commands::Build { debug, release, output }) => {
                // Force evaluation to prevent optimization
                std::hint::black_box(debug);
                std::hint::black_box(release);
                std::hint::black_box(output);
            }
            None => {}
        }
    }

    let duration = start.elapsed();
    println!("Total time: {:?}", duration);
    println!("Average time per iteration: {:?}", duration / ITERATIONS as u32);
}

Before I show you the results, what is your guess for how big the speed difference will be?

Hey, do not peek!

Ok, here goes:

clang -O3 main_cmdline.c -o main && ./main build --debug --output o.exe
Running 10000 iterations of command line parsing...
Total time: 0.971000 ms
Average time per iteration: 0.100 µs

cargo run --release -- build --debug --output o.c
    Finished `release` profile [optimized] target(s) in 0.01s
     Running `target/release/cmdline_example build --debug --output o.c`
Running 10000 iterations of command line parsing...
Total time: 52.841334ms
Average time per iteration: 5.284µs

Turns out the Rust version with clap is **50x SLOWER**! 54.41x to be exact.

Rust usually runs close to C speeds for equivalent implementations, so the speed difference is likely all overhead from the library.

50x hit to performance. Just so you can have some magic macros for your command line interface. I guess I’ll be skipping this one.

Finally, maintainability. Another common misconception is that by building on top of libraries, especially libraries that lots of people use, your code will be more stable. After all, these libraries are battle-tested every day, what could go wrong?

Well, for this Casey actually has an excellent thread outlining how software stability decreases as dependencies increase. The TLDR is: even if each dependency is individually stable, the more dependencies you have, the higher the chance that your software will break. With 100 dependencies, your software is basically guaranteed to break in less than a year.
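You can put rough numbers on that claim. If each dependency independently has some probability p of shipping a breaking change in a given year, the chance that at least one of your n dependencies breaks is 1 - (1 - p)^n. The 5% figure below is just an illustration of mine, not a number from Casey’s thread:

#include <stdio.h>
#include <math.h>

// Back-of-the-envelope version of the argument: assuming each dependency
// independently has probability p of a breaking change per year (p = 5% is
// an illustrative guess), the chance of at least one break is 1 - (1 - p)^n.
int main(void) {
    double p = 0.05;
    int counts[] = {1, 10, 35, 100};
    for (int i = 0; i < 4; i++) {
        int n = counts[i];
        printf("%3d deps -> %.1f%% chance of at least one break per year\n",
               n, (1.0 - pow(1.0 - p, n)) * 100.0);
    }
    return 0;
}

With p = 5%, 10 dependencies already give you roughly a 40% chance of a break per year, and 100 dependencies push it past 99%.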

However, I want to bring your attention to something else. Here’s a question for you to ponder:

If the library does break, what is the cost of fixing it?

This question is important. When developing software, one must think not just about the cost of writing the software, but the cost of maintaining it for its entire lifetime. That is, the cost of fixing bugs, adding features, supporting new use cases, and so on.

And one thing that any professional developer knows is that it’s much easier to work on your own code than on other people’s code. Harder still is working on code outside your main project or organization, with a completely different style and different assumptions, i.e. library code.

So even if the library is very stable, it’s often the case that if there’s a bug that affects you, it’s going to take a significant amount of time for you to fix it (or to beg the library writer to fix it). For simple libraries, this time can easily be longer than the time to write the functionality you needed in the first place.

So, I told you at the beginning of this article that I was going to outline the pros and cons of using libraries, and admittedly I focused a lot on the cons. This is by design: my bias is toward not using libraries unless it’s really necessary.

Personally, I only consider using a library in two situations:

  1. When the benefits the library provides far outweigh the costs of depending on it, or

  2. When there isn’t enough time to write a proper implementation and I have to make progress (usually due to external pressure)

Fitting the first situation are usually highly specialized, highly performant, and highly self-contained libraries for which I’d be hard-pressed to implement something better myself. One example is cglm, which I use in my engine. Maybe it’s possible to write a faster and more ergonomic library for 3D math; I just don’t know how.
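For context, using cglm looks roughly like this: plain C types on the stack, free functions, no allocations, no global state. (The calls below are a sketch from memory; check cglm’s headers for the exact signatures.)

#include <cglm/cglm.h>
#include <stdio.h>

// Sketch of typical cglm usage: build a view-projection matrix and
// transform a point. Signatures written from memory, treat as illustrative.
int main(void) {
    mat4 proj, view, view_proj;

    glm_perspective(glm_rad(60.0f), 16.0f / 9.0f, 0.1f, 100.0f, proj);
    glm_lookat((vec3){0.0f, 2.0f, 5.0f},   // eye
               (vec3){0.0f, 0.0f, 0.0f},   // target
               (vec3){0.0f, 1.0f, 0.0f},   // up
               view);
    glm_mat4_mul(proj, view, view_proj);

    vec4 p = {1.0f, 0.0f, 0.0f, 1.0f};
    vec4 clip;
    glm_mat4_mulv(view_proj, p, clip);

    printf("clip space: %.2f %.2f %.2f %.2f\n", clip[0], clip[1], clip[2], clip[3]);
    return 0;
}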

As for the second situation, what I usually do is try to find the best library available, and then make a task for myself to eventually replace its implementation. Currently I have two examples in my engine: sokol_gfx, which I use for desktop graphics, and clay, which I use for UI.

To be clear, both sokol and clay are extremely well-written libraries. Both are low-footprint and highly performant. It would be totally fine to depend on them for the long haul; I just have a preference towards writing code in-house.

This entire article can be summarized into one sentence:

“Libraries are not free.”

There are real costs, usually passed down to the user in the form of a bloated, slow application. But they are also paid by developers every day, in the form of dependency-management hell, slow iteration times, and obscure bugs that are incredibly hard to fix. All to save what is oftentimes a couple of hours or days of initial investment. I’m gonna take door B, Bob.

If you’d like to be notified of new articles, please consider subscribing.
