
As a language, C has managed to do a remarkable job providing an incredibly useful middle ground between assembly languages and other systems languages… for more than 50 years.
At its heart is a simple imperative language with accessible enough syntax. And while concepts like pointers are often a challenge for people coming into the language, if you are doing systems programming, you should have to understand them. Not to mention, C is a massive upgrade over dealing with the problem at the ASM level.
In this world, there are plenty of great things about C that have led to it being, in many ways, the most successful programming language ever, like it or not. It is entrenched at the bedrock of many bits of software used in other languages and environments.
There are also plenty of things that are painful about C. And I’m not talking about memory management; in many cases outside of kernel development, you could (and should) use a garbage collector by default (it’s easier than you’d expect).
Most of the things people find painful are the direct result of a language being so successful and important, coupled with it being well curated — the language has improved in many notable ways in the more than 25 years of shepherding through standardization, but done so in a way that ensures old, stable IMPORTANT code can be brought forward as the world changes.
There are many things about C that were fine 50 years ago, that nobody would do if starting from scratch today, most stemming from its philosophy, “Trust the programmer,” which has definitely NOT stood the test of time. Nonetheless, such things will endure, and they do raise the bar to entry for C programmers, and can even cause problems for seasoned engineers. In some sense, those things are atrocities, because you wouldn’t have to deal with them in other languages (macros probably being the worst of them all, in my view).
But, while C changes, it does so in a way that makes sure not to break things that are in common use, and this is incredibly important. So the things that are atrocious should stay in the language, but enhancements and/or alternatives are on the table, and are very carefully considered.
C remains the only language in its class, and I do love it in a way that is not Stockholm syndrome. But I believe I have the perspective to know that every programming language, including the great ones, has many ways in which it is atrocious. I try to be even-keeled about it.
Of the many atrocities a C programmer deals with, one of the most unnecessary might be what passes for variable argument support in functions. This is one of the many things that is a relic of the era in which it was built — it was a decent abstraction over the raw assembly for the day, through a time where Pascal (its most prominent competitor for a while) couldn’t support declaring functions that took arrays as parameters, unless those arrays were fixed size.
But, from a developer’s point of view, C’s variable arguments approach is basically the same as it was before I was born; it feels primitive. We can do better by modern C developers, without breaking anything about past code.
While it would be nice to see some minor enhancements to the language, we actually can give ourselves a vastly better experience without even having to enhance C.
A stdarg API refresher
If you’re used to sane systems for variable arguments, with a compact, intuitive syntax, a cursory glance at the language might not reveal how problematic all this is. You can declare variable arguments with a simple ellipsis:
Calling a varargs function isn’t even a problem:
Things go downhill fast when you actually want to access the variable arguments. First, you have to declare a va_list and set it up:
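A minimal sketch that puts the declaration, the call, and the va_list setup together. The name my_format() and its out and cap parameters are hypothetical, chosen only so that fmt is the last fixed argument; the sketch leans on the standard vsnprintf() to consume the list.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-style helper: the ellipsis declares the variable arguments. */
int my_format(char *out, size_t cap, const char *fmt, ...)
{
    assert(out != NULL && fmt != NULL);

    va_list args;          /* state used to walk the variable arguments */
    va_start(args, fmt);   /* fmt is the last fixed parameter           */
    int written = vsnprintf(out, cap, fmt, args);
    va_end(args);

    return written;        /* length the full output needs, as with snprintf() */
}
```

A call then reads like any printf-family call, for example my_format(buf, sizeof buf, "%d items", 3).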
Ick. And what’s fmt doing in that call to va_start()? You’re expected to explicitly pass in the last fixed argument.
So when you decide you want to make a version that restricts the maximum length, and copy and paste the body of your code into the new prototype you chose:
Now your va_start is wrong. Your compiler may or may not complain. It may well compile, and when it does, it may or may not work. I’ve seen cases in a big system where the compiler didn’t blink, and while there were resulting problems, they were subtle enough that it was challenging to track down.
That’s actually the one thing C finally (somewhat) addressed in C23. You now no longer need to provide the second argument to va_start() (and in fact, if it is provided, C23 should ignore it, beyond a possible warning). That means, for the first time in C, you can have a function with no fixed parameters. 🎉
To access parameters via a va_list, the language provides va_arg(), a macro which takes two arguments: the va_list object and the type name.
When you’re done with your va_list, you call va_end(), passing in the va_list again. Often this gets forgotten, but it’s undefined behavior if you don’t do it. Don’t worry; trust that the compiler does have your back. It’s not going to maliciously generate code that crashes because you forgot something. What the rule really means is that there are environments where things might break in an unexpected way, like architectures where the stack doesn’t get properly cleaned up if you don’t call it. Omitting it may never be an issue in the real world, sure. But as a good C programmer, you want to do it anyway… just in case. So either it’s one more bit of ceremony that chips away at your time by a fraction of a second, or you live with the tinge of guilt that comes with skipping it. 😊
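The whole va_start(), va_arg(), va_end() lifecycle fits in a few lines. This sketch assumes a hypothetical sum_ints() whose first parameter says how many int arguments follow:

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: adds up `count` int arguments. */
long sum_ints(int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count);              /* count is the last fixed parameter */

    long total = 0;
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);     /* we promise each argument is an int */
    }

    va_end(args);                       /* the bit of ceremony that gets forgotten */
    return total;
}
```

Nothing checks that the caller really passed count values of type int; the function simply trusts its inputs.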
Top varargs atrocities
Clunky syntax isn’t so much a problem; we can appreciate the C heritage. Implementation-dependent behavior is also part of the Luxe C experience; this is where the standards committee gets realistic about differences between multiple existing implementations that are unlikely ever to converge.
That history is at the root cause of many of C’s problems with variadic functions (varargs is just the colloquial term, but absolutely the one I’m going to use). These problems are all understandable, but they are problems, nonetheless. Here’s my personal laundry list:
- There’s no standard way to signify when variable arguments are done.
- There’s no type safety.
- You can’t (easily) build on top of varargs functions.
- You can’t reliably manually create or change a va_list (beyond copying one as-is).
Let’s quickly look at each of these problems.
Problem 1: The arguments may never stop 😢
Typically, people use a null pointer to signify the end of arguments. It’s common to wrap variadic functions with a macro in an attempt to remove the burden from the developer as much as possible. Let’s imagine an interface, char *my_join(char *joiner, ...), a function that joins a variable number of char * arguments using the provided joiner.
Often, the actual function is hidden, wrapping it in a way that ensures a null pointer is automatically added at the end.
So you might see:
Those two macros together will rewrite:
to:
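Roughly, the pieces could look like the sketch below. The macro bodies and the function body are assumptions reconstructed from the description in this section, and (char *)NULL stands in for nullptr so the code also builds with pre-C23 compilers that support __VA_OPT__:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Appends the terminator. __VA_OPT__(,) emits the separating comma only
 * when at least one variable argument was passed. */
#define my_va(...) __VA_ARGS__ __VA_OPT__(,) (char *)NULL

/* Thin wrapper that always adds the comma after the last fixed argument.
 *   my_join(", ", str1, str2, str3)
 * is rewritten to
 *   _my_join((", "), str1, str2, str3, (char *)NULL)                      */
#define my_join(joiner, ...) _my_join((joiner), my_va(__VA_ARGS__))

/* Joins its char * arguments until it reads the null terminator.
 * Returns a heap string the caller must free(), or NULL if allocation fails. */
char *_my_join(const char *joiner, ...)
{
    assert(joiner != NULL);

    size_t jlen  = strlen(joiner);
    size_t len   = 0;
    bool   first = true;
    char  *out   = malloc(1);
    if (out == NULL) {
        return NULL;
    }
    out[0] = '\0';

    va_list args;
    va_start(args, joiner);
    for (;;) {
        const char *s = va_arg(args, char *);
        if (s == NULL) {
            break;                          /* terminator reached */
        }
        size_t slen  = strlen(s);
        size_t sep   = first ? 0 : jlen;    /* no joiner before the first item */
        char  *grown = realloc(out, len + sep + slen + 1);
        if (grown == NULL) {
            free(out);
            out = NULL;
            break;
        }
        out = grown;
        memcpy(out + len, joiner, sep);
        memcpy(out + len + sep, s, slen + 1);   /* copies the trailing '\0' too */
        len  += sep + slen;
        first = false;
    }
    va_end(args);
    return out;
}
```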
The first macro, my_va(), does the work of auto-adding the null pointer as a terminator. It can be defined once and then used globally.
Instead of asking the end user to remember to call my_va() or manually add the null pointer, the macro transparently takes care of it.
We’ll look at the my_va() macro in more detail, but before that, let’s quickly examine the second macro, because in many systems it’s the end programmer’s actual interface. my_join() saves the end user of the API from having to understand the implementation details and from always remembering to terminate the argument list; my_va() is just the simple veneer that removes toil from whoever is building that API.
Some person using my_join() as an API call may expect that this is an actual function. But it’s a macro, meant to be an incredibly thin wrapper around the actual function (a style issue; if you don’t need ABI compatibility, I’d rather not burden people with details of an abstraction if it’s distracting).
And the wrapper is very thin. All it does is:
- Replace the user’s invocation with an invocation of the actual function, which we’ve renamed to _my_join() by adding a leading underscore. This is akin to what you would do in Python to signify, “this is really supposed to be hidden, welcome to the implementation detail.”
- Copy in the arguments passed into the macro, while making sure the null pointer is appended. That’s done by passing the builtin __VA_ARGS__, which expands to the text supplied at the call site, into our my_va() macro.
And this will work, even when there are no variable arguments to pass. In that case, the call still gets a nullptr, so the function will see that the beginning IS the end.
One little caveat is that the static text at the call site may have macros that get expanded before our macros get expanded. Bad interactions between mostly orthogonal macros are one of the things that make the preprocessor its own nightmare. But let’s not look under that hood, as it’s not clean or pretty, and it’s a tangent.
Suffice it to say, the most defensive thing we can reasonably do is put macro parameters that we use in the body inside parentheses.
Yes, you use the ellipsis to declare varargs parameters both in macros and in functions. Accessing those parameters in a macro is done via __VA_ARGS__, which essentially will be replaced with the fully expanded text inserted in its place, whatever text that turns out to be, commas and all.
Note here that the wrapper macro we wrote, my_join(), always adds its own comma after the last fixed argument. While the my_va() call could take care of that comma instead, asking developers who are writing a varargs function to use function call syntax, except with the comma omitted, only confuses them.
As a result, my_va() cannot always add a comma of its own. _my_join(joiner,,nullptr) is not valid C, so we can’t have the preprocessor generate it. Inside my_va(), we need to add the arguments in, as well as the null pointer. If any variable arguments were passed, we should add a comma separating the last passed argument from that null pointer. But with no arguments passed, we want to skip that and use the existing comma.
That’s what __VA_OPT__() is doing for us. It’s another special built-in that is replaced by (the macro expansion of) what’s inside the parentheses, but ONLY if one or more variable arguments gets passed. Otherwise, it gets replaced with the empty string.
Now, the implementation of _my_join() does still need to check the result of each va_arg() call and stop when it sees a null pointer.
Unfortunately, this whole approach won’t work for us if we ever want to accept 0 or nullptr as a valid value. It’s possible that str2 could be nullptr itself, perhaps even intentionally, as it’s not too uncommon to use a null pointer to represent the empty string (the somewhat better option when using C strings is to instead pass a pointer to a single byte containing the value 0). If that’s the case, we’re going to end up returning just str1, and completely ignoring poor old str3.
We also could never use this approach for writing a function to sum a list of numbers, because we can’t stop summing, just because someone input the number 0. A “better” general-purpose approach is to always pass the number of variable arguments explicitly into the function, so that it knows how many arguments are available. But that would be incredibly error-prone if it relied on developers to do it manually.
So ideally, we would build that into our transparent wrapper. We would want something like:
Sadly, there is no builtin named __VA_COUNT__, so you’d have to build it yourself. Call it something like MY_VA_COUNT(); you would then call MY_VA_COUNT(__VA_ARGS__) instead of our desired __VA_COUNT__.
🤓 | While C does not have a builtin namespace feature, please always manually use a single namespace prefix for everything in C; you don’t want problems down the road when symbol names clash. |
The clearest, most robust code I can imagine to implement such a macro is still quite ugly due to C’s macro rules. I’ll cover that too, in my next article, which should be out next week.
Still, if you’re the one designing an API call, you should always take this approach, as a bare minimum. Hold your nose, use macros to automate passing in the number of variable arguments that are coming, and then make sure you pull exactly that number of arguments (or fewer).
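For the narrow case where every argument is an int, counting can be sketched without heavy machinery. This is one well-known trick, not the general-purpose macro promised above; the names my_sum() and _my_sum() and the body of MY_VA_COUNT() are assumptions:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Counts arguments by sizing an int array built from them. The leading 0
 * keeps the array non-empty when nothing is passed and is subtracted again.
 * sizeof does not evaluate its operand, so the arguments are not evaluated
 * here. Only works for arguments that convert to int. */
#define MY_VA_COUNT(...) \
    (sizeof((int[]){0 __VA_OPT__(,) __VA_ARGS__}) / sizeof(int) - 1)

/* The wrapper passes the count automatically, then the arguments themselves. */
#define my_sum(...) \
    _my_sum(MY_VA_COUNT(__VA_ARGS__) __VA_OPT__(,) __VA_ARGS__)

/* Pulls exactly `count` int arguments, never more. */
long _my_sum(size_t count, ...)
{
    va_list args;
    va_start(args, count);

    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += va_arg(args, int);
    }

    va_end(args);
    return total;
}

/* Usage: zeros are ordinary values now, and an empty list is fine too. */
void my_sum_demo(void)
{
    assert(my_sum(1, 0, 2) == 3);
    assert(my_sum() == 0);
}
```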
Of course, for people coming from other languages, the need for macros for basic checking may seem absurd. You may wonder, why couldn’t some C standard along the way have at least specified an interface for finding how many arguments remain to be accessed in a va_list?
Either there were more important problems (they’ve clearly tackled some important items), or maybe nobody thought to do it. 🤷‍♂️
Problem 2: It’s not safe here
Being a product of its age, C has a notoriously weak type system by modern standards. The type system is at its worst when it comes to variable arguments. You shouldn’t expect any static type checking for these arguments whatsoever, unless you’re calling a varargs function already in the C standard library, like *printf(). There, most compilers bend over backward to check call sites, because they know the semantics of the backend implementation. You’re not going to be so lucky anywhere else.
Unfortunately, C doesn’t give you any dynamic type checking either. With your call to va_arg(), you specify the type you’re expecting, but this isn’t used for any runtime checking at all (compilers might use it to check call sites in-module, but probably not, and don’t depend on it).
Instead, the type information is used statically to tell the underlying implementation how many bytes to grab.
You might think C doesn’t need type information to pull from the variable arguments. Generally, you’d expect them to all be stored contiguously, probably on the stack. On 64-bit architectures, the underlying ABIs (Application Binary Interfaces) generally require promoting all arguments to 64 bits, whether they’re passed in registers or not. But that is ABI dependent, and if you need to worry about 32-bit platforms, the ABIs are much more diverse. The C standard itself only demands the default argument promotions: integer types narrower than int (yes, including Booleans) are promoted to int, and float is promoted to double.
Even on 64-bit platforms, the type of data being passed can matter, depending on the ABI. For instance, for the 128-bit __int128 type (a GCC and Clang extension, not part of standard C), some ABIs require that the value is always passed by reference (meaning, a pointer to the value gets passed, not the value itself). Other ABIs will pass the whole thing. For instance, the AMD64 System V ABI will pass this as two 64-bit values, while the Windows x64 calling convention passes a pointer for any value that does not fit in 64 bits.
So, when there are large arguments or you’re on old hardware, the underlying varargs implementation may use the hint. But often, it’s effectively useless, and certainly is useless to you, in terms of ensuring correctness. What’s a poor API designer to do? A couple of options:
- For APIs where all arguments should be of the same type, you can use C’s new-ish _Generic selection feature, which is a kind of compile-time switch() statement, where you switch on the value type (which in other languages is sometimes called a typecase statement). This is actually tough to get right because the semantics aren’t as easy or flexible as a casual user might expect. We won’t cover this today (you can look in the code that accompanies this article if you like), but suffice to say, it’s workable, and once you’ve built it well, you can hide it behind a macro.
- If you are building something like printf() where the types cannot be homogeneous, you can usually build a system in such a way that you can safely test memory, and look up a runtime type, if available. But this is a lot of work, and can get expensive, depending on your needs (and we won’t cover this today, either).
In many languages, you’d never need the second option, so static type safety would suffice.
But those languages are likely to have some sort of keyword argument capability, which C absolutely does not have.
ℹ️ | _Generic is an example of C working hard to help *enable* better static type checking, while living in the constraints of more than 50 years of legacy. It’s a great addition to C, even if it’s the only language that should ever have this feature (anything newer would do better to start with a more general-purpose compile-time parametric polymorphism baked into the type system). |
While code misinterpreting values is a real problem, ABI type problems aren’t too much of an issue in the context of C, provided you are running on a 64-bit platform. There, while the following code is wrong, it will probably work out of the box:
It’s wrong because people can easily pass small integers to the function. But it’s unlikely to break, since every 64-bit ABI I’ve ever used will widen all smaller parameters to 64 bits.
If you’re on a 32-bit platform, it’s very likely to break. For instance, consider a call like this one:
Even if current is a long long and guaranteed to be at least 64 bits, the integer literals are typed as int at the call site. The size of int is implementation-dependent, but sadly, it is almost always 32 bits today.
Meaning, when you ask for your six numbers, the underlying varargs implementation will probably treat pairs of numbers as a single 64-bit long long int. Not only will you get three bad numbers, you’ll end up processing another 192 bits of garbage, as you pull arguments up to the 6 that the count argument will (rightly) know to expect.
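One defensive habit is to make every argument’s type at the call site match what va_arg() will ask for. A sketch, assuming a hypothetical sum_ll() that expects count values of type long long:

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: every variable argument MUST be a long long. */
long long sum_ll(int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count);

    long long total = 0;
    for (int i = 0; i < count; i++) {
        total += va_arg(args, long long);   /* reads a full long long each time */
    }

    va_end(args);
    return total;
}
```

Calling sum_ll(3, current, 1LL, 2LL) is well defined on both 32-bit and 64-bit targets, because the LL suffix makes each literal a long long. Writing sum_ll(3, current, 1, 2) instead passes two int values and has undefined behavior, even on platforms where it appears to work.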
Problem 3: You can’t easily get the needed leverage
Here’s a common problem in the life of a C programmer. Let’s say you want to build a logging API to automatically add timestamps, send logs to the right place, etc. The naive C programmer thinks, “I can give people the same flexibility as printf() by just wrapping fprintf() to do what I need it to do!” Here’s your prototype:
The idea being, you’d construct a header, and supply the output file all transparently, and then pass the user’s actual format string and arguments to something like fprintf(), or snprintf().
The problem is, there’s no direct way to pass the parameters of one varargs function to another. You would not be able to call fprintf(), snprintf(), or even printf() from your my_log() implementation.
There’s a kludgy workaround in the language itself. You can pass a va_list between functions. So if you start your implementation with the boilerplate:
You may now pass the variable args as a single, fixed parameter.
You still can’t call any of the functions you want to call, but that’s okay, because for every varargs *printf() call in the standard library, there’s a fixed-argument analog starting with the letter v. So you can, instead, call vfprintf(), vsnprintf(), and vprintf().
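A sketch of that wrapper, assuming a hypothetical my_log() that takes the output stream and a topic as fixed parameters (the original prototype may well differ) and uses a Unix timestamp in the header:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical logging wrapper: writes "[topic @unix-time]: " and then the
 * caller's formatted message. Returns the number of characters written, or
 * a negative value on an output error. */
int my_log(FILE *out, const char *topic, const char *fmt, ...)
{
    assert(out != NULL && topic != NULL && fmt != NULL);

    int head = fprintf(out, "[%s @%lld]: ", topic, (long long)time(NULL));
    if (head < 0) {
        return head;
    }

    va_list args;
    va_start(args, fmt);
    int body = vfprintf(out, fmt, args);   /* the v-variant accepts the va_list */
    va_end(args);

    return body < 0 ? body : head + body;
}
```

A call such as my_log(stderr, "net", "retry %d of %d\n", 2, 5) then looks like an ordinary printf-style call to the user.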
That’s pretty weak. And if you have the foresight to think, people want to build more abstraction on top of your log API (for instance, to do custom logging per topic), you will also need to produce two versions of your own API call.
Of course, that’s not too big a deal. Typically, variadic functions are implemented as a minimal wrapper on top of their va_list counterpart.
For instance, I suspect the following code is similar enough to almost every standard library implementation of fprintf(), that you might think I plagiarized it, though I did not:
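The formulaic shape, sketched here under the hypothetical name my_fprintf() so it does not collide with the real function:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* The variadic function is only a shim over its va_list counterpart. */
int my_fprintf(FILE *stream, const char *fmt, ...)
{
    assert(stream != NULL && fmt != NULL);

    va_list args;
    va_start(args, fmt);
    int result = vfprintf(stream, fmt, args);
    va_end(args);

    return result;
}
```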
No, this approach is absolutely formulaic.
So why does C make you go through all the pain? 🤦
Oh, right, legacy. Okay, good reason.
Problem 4: You can’t edit your arguments
In our previous example, what if we want to implement our function by combining two format strings and passing in some new arguments?
For instance, you might just want to dynamically concatenate the string "[%s @%s]: " to the front of the user’s format string and add arguments to the front of the variable argument list, one for the topic, and one for the time. You’ll then want to call vfprintf() under the hood to do the heavy lifting for you.
Nice idea, but you would need a way to build a new va_list to pass to vfprintf() . The C standard gives you no way to do that out of the box. 😩
Or, let’s say you want to call a version of the function my_join() from above, like perhaps a my_vjoin(). But, you want to print all of the arguments passed into your main() function’s argv parameter.
Keep dreaming — there’s no good way to go from an array to a va_list.
I’ve seen people try to create a my_to_va_list(...) function that returns a va_list object (and have been that person). If that’s passed on the stack, that’s a clear, huge mistake that might seem to work in some scenarios.
I’ve seen more intricate attempts at such hackery that will try to dynamically allocate a return value, build a static va_list and copy data into it.
Whether that works or not varies depending on the exact standard C library implementation you’re using. It absolutely does not work portably everywhere, and you’d be lucky to get it to work reliably in most environments.
And if you can get it to work because you plumbed the implementation details, you then have to worry about doing all that work for every platform you want to support.
And while compilers usually offer a built-in implementation that a library writer can transparently use, you might also have differences across libc versions on the same piece of kit.
Unfortunately, the C standard allows the library implementer so much flexibility that, even if you somehow found it easy to locate details about each implementation, it’s untenable to support the universe by keeping up with each varargs implementation.
And that’s all before you think about the different platforms those libraries support, where they might change implementation strategy based on the hardware environment.
Implementations do vary greatly.
The va_list type used to typically look like this:
Now, it will more commonly look like this:
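For illustration, the two shapes can be written as ordinary typedefs. The names legacy_va_list and sysv_va_list are made up so they don’t clash with the real type; the struct fields follow the layout documented in the System V AMD64 ABI, and other platforms differ:

```c
#include <assert.h>
#include <stddef.h>

/* Older style: the list is just a cursor into the stacked arguments. */
typedef char *legacy_va_list;

/* Register-era style, following the System V AMD64 ABI's documented layout. */
typedef struct {
    unsigned int gp_offset;          /* offset of the next general-purpose register slot */
    unsigned int fp_offset;          /* offset of the next floating-point register slot  */
    void        *overflow_arg_area;  /* next argument that was passed on the stack       */
    void        *reg_save_area;      /* where the register arguments were saved          */
} sysv_va_list_tag;

/* A one-element array, so the object decays to a pointer when passed along. */
typedef sysv_va_list_tag sysv_va_list[1];

static_assert(sizeof(sysv_va_list) > sizeof(legacy_va_list),
              "the register-era list carries more state than a bare pointer");
```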
If you go looking in your friendly libc implementation, you may well not directly see either approach, because this is an area where the implementation can just let the compiler take care of it. Both Clang and GCC are willing to do that, and have an opaque builtin type called __builtin_va_list. Using it tells the compiler to just handle all of the details itself. That’s definitely how musl declares va_list, for example.
The char * option has its roots in the history of C, where the targeted hardware wasn’t typically going to pass parameters via registers. It was always going to be on the stack. That made va_arg() pretty easy to implement. To implement va_arg(), you could do something like (using a more modern atomic call for simplicity and clarity):
Here, the atomic_fetch_add() function will add the given number of bytes to the value of ap, but returns ap’s value BEFORE we add to it.
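The same fetch-then-advance idea can be sketched in portable code. This toy uses a plain helper in place of atomic_fetch_add(), assumes every argument occupies one 8-byte slot, and copies values out with memcpy(); toy_va_list, toy_fetch_add(), and toy_va_arg() are hypothetical names, not how any real libc spells it:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_SLOT = 8 };          /* assumed: each argument fills one 8-byte slot */
typedef char *toy_va_list;      /* the whole "list" is a cursor */

/* Fetch-and-add: advance the cursor by n bytes, return where it pointed before. */
static char *toy_fetch_add(toy_va_list *ap, size_t n)
{
    char *before = *ap;
    *ap += n;
    return before;
}

/* Copy the next argument of type T into *out and step to the next slot.
 * memcpy() sidesteps alignment and aliasing concerns in this toy. */
#define toy_va_arg(ap, T, out) \
    memcpy((out), toy_fetch_add(&(ap), TOY_SLOT), sizeof(T))

/* Plays the caller's role: lays two ints out in consecutive slots, then
 * walks them the way an old-style va_arg() would. */
int toy_demo(void)
{
    char area[2 * TOY_SLOT] = {0};
    int first = 10, second = 32;
    memcpy(area, &first, sizeof first);
    memcpy(area + TOY_SLOT, &second, sizeof second);

    toy_va_list ap = area;
    int a, b;
    toy_va_arg(ap, int, &a);
    toy_va_arg(ap, int, &b);
    assert(ap == area + 2 * TOY_SLOT);   /* the cursor moved past both slots */

    return a + b;
}
```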
This approach is now uncommon because implementations are going to align to the ABI anyway, so they will want to reuse all of the ABI’s argument tracking, instead of managing it themselves.
In modern ABIs, it’s typical (but not universal) to treat variable arguments the same way as normal arguments. In most cases, the first n arguments will get shoved into registers, and then the rest will probably be passed on the stack.
That’s why the more modern va_list structure is so much more involved than a simple pointer. It needs a lot more state to treat those arguments like every other. For fixed parameter lists, it can determine how to pass things at compile time, but for variable arguments, it may not always be able to do it. Especially if you’re committed to passing the arguments to the underlying ABI as if there’s no language special feature… they’re all just parameters.
Perhaps it’s more efficient, but even if so, it is unlikely to make much of a difference compared to alternative approaches. And it has the downside that our arguments are probably not going to live in a nice, tidy array of 64-bit values.
The net here is, given the history of the current C variadic functions API, and its widespread use, there’s not a realistic path to an interface that gives full access to each argument, in a portable way, without breaking anything currently in production at the ABI level.
C does give you va_copy() to allow you to make guaranteed copies of the state of a va_list object, so that you can iterate over it multiple times. But that’s almost always going to leave your copy on the stack, and still leaves the locations of your arguments opaque.
So if you want a function to construct a new va_list, it’s problematic, especially if you want to add, remove, or change items in your copy (never mind modify in place).
Imagine you dynamically test memory, and want to leverage *printf() so that, any time there’s a %s specifier, you look at the associated parameter. If you find a C string, then sure, you pass it through. But, if you find one of your special object types, you’ll call an object-specific function to produce a string representation of that object, and substitute that into what you pass to *printf().
It’s a reasonable thing to want to do, yet the stdarg API doesn’t let you do it.
Practical Coping Strategies
For the first two problems, we’ve already mentioned common techniques to work around existing issues that are worth using, even if they do rely on hiding ugliness through the added ugliness of macros (the specific macros, we will cover next week).
What else can we do, though?
🐢🐢🐢 …all the way down 🐢🐢🐢
One common approach is to add a third layer of APIs for your variadic functions, one that takes a flat array. In this world, the version of your call taking a va_list would unload the entire list into a single consecutive array of 64-bit items. And, it keeps track of the length.
If we do this right, we are actually at the bottom layer of turtles. I recommend the following structure:
Here, nargs is meant to be fixed at allocation time, and is the number of items that get allocated for the args field. The cur_ix field keeps the index of the next argument from args to yield.
Now, you have all the flexibility you might need. It’s pretty simple from that object to construct a simple yet powerful API, like:
Here, we’ve given ourselves the option to get the next item by value or get a reference to it. You can swap out individual items if you have reason to. It’s also easy to merge two of these lists.
If you need to iterate through the list multiple times, you don’t have to duplicate anything in advance; you can just reset the original context. And, if you need to peel off the first couple of args, then, depending on their values, pass down the rest to a helper; you can absolutely do that without problems.
The above functions that return bool return true if the argument was present, or false if you’re out of arguments (or if va is a null pointer to begin with). The actual value requested will, when the requested argument is not out of bounds, get stored at the memory address given in outp.
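A sketch of how the structure and the accessors described above might look. The field names come from the text; the field types, the use of void * slots, and the function names h4x0r_vargs_next(), h4x0r_vargs_next_ref(), and h4x0r_vargs_reset() are assumptions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t nargs;    /* number of slots in args, fixed at allocation time */
    size_t cur_ix;   /* index of the next argument to yield               */
    void **args;     /* the flattened arguments, one slot each            */
} h4x0r_vargs_t;

/* Stores the next argument in *outp and advances. Returns false when the
 * list is exhausted or va is a null pointer. */
bool h4x0r_vargs_next(h4x0r_vargs_t *va, void **outp)
{
    if (va == NULL || outp == NULL || va->cur_ix >= va->nargs) {
        return false;
    }
    assert(va->args != NULL);
    *outp = va->args[va->cur_ix++];
    return true;
}

/* Same, but yields a pointer to the slot so the caller can swap the item. */
bool h4x0r_vargs_next_ref(h4x0r_vargs_t *va, void ***outp)
{
    if (va == NULL || outp == NULL || va->cur_ix >= va->nargs) {
        return false;
    }
    assert(va->args != NULL);
    *outp = &va->args[va->cur_ix++];
    return true;
}

/* Rewinds so the list can be walked again without copying anything. */
void h4x0r_vargs_reset(h4x0r_vargs_t *va)
{
    if (va != NULL) {
        va->cur_ix = 0;
    }
}
```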
These functions are not too challenging to implement. However, one good thing about va_list is that any allocation and deallocation it needs are handled automatically at the call site. We can handle that need too, if we sacrifice our dignity, by reaching for a macro layer.
For instance, you might start by defining a couple of functions:
Those are pretty straightforward to build, and once you get a h4x0r_vargs_t, you are on the C version of easy street (which is still difficult; it is a systems language after all).
🎵 | When you want to have all variable arguments be one fixed type, it’s possible to automate type checking on variable arguments at the call site. It does require a bunch of macro ugliness, such as being able to iterate over the arguments at compile-time. So it’s ugly and hard. But in my article that I’ll publish next week, I’ll show the basic approach for doing that kind of iteration. |
🐢 One turtle only
If you’re adding such an array layer, you might want to consider nuking the layers of wrappers the language pushed on you.
You can do so; it simplifies things and makes it more palatable to use varargs when appropriate, instead of kludging your way through it every time.
Instead of relying on the user to call a function that dynamically allocates, you can automate putting the structures on the stack. That has the benefit of automating the clean-up.
In most cases, it’s going to be more maintainable to have your API separate out allocation from populating the structures. For instance, you could use one of the above two functions to dynamically allocate, and then build functions like those below to initialize:
Both functions would take a pre-allocated structure (stack, static, or heap, we don’t care), along with a count of the number of args. They’d require that the args field of the h4x0r_vargs_t struct passed in points to memory that is sized based on the corresponding count.
As declared here, the first would be a true C variadic function, ideally the only one in our entire code base. This function simply needs to unload each argument up to the count into the allocated args array, and then initialize the two numeric fields (the nargs field would get ct; the cur_ix field will get 0).
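A sketch of such a pair of initializers, assuming void * slots and the hypothetical names h4x0r_vargs_init() and h4x0r_vargs_init_va(); the struct’s field types are also an assumption:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

typedef struct {
    size_t nargs;    /* number of slots in args          */
    size_t cur_ix;   /* index of the next item to yield  */
    void **args;     /* caller-provided storage          */
} h4x0r_vargs_t;

/* Unloads ct pointer arguments from src into va->args, which must already
 * point at room for ct items. Returns va so calls can be chained. */
h4x0r_vargs_t *h4x0r_vargs_init_va(h4x0r_vargs_t *va, size_t ct, va_list src)
{
    assert(va != NULL && (ct == 0 || va->args != NULL));

    for (size_t i = 0; i < ct; i++) {
        va->args[i] = va_arg(src, void *);
    }
    va->nargs  = ct;
    va->cur_ix = 0;
    return va;
}

/* The one true variadic function: a shim over the va_list version. */
h4x0r_vargs_t *h4x0r_vargs_init(h4x0r_vargs_t *va, size_t ct, ...)
{
    va_list src;
    va_start(src, ct);
    h4x0r_vargs_init_va(va, ct, src);
    va_end(src);
    return va;
}
```

Callers of this sketch are expected to pass every item as a void * (cast where needed), since the function reads each slot back with that type.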
Both these functions would return their first argument, so that you can hide the allocation logic behind macros, like so:
Here, I used the widely available (though non-standard) function alloca() to instantiate the array element of the structure. That call stack-allocates the given number of bytes, returning a pointer to the start. This approach is not the best choice for h4x0r_varargs(); we should instead be using C’s variable-length arrays, where possible.
Unfortunately, this is not currently possible when populating from an array or a va_list, because we would need to use C’s newish compound literals to initialize the array, and it’s not currently possible to initialize a variable-length array using a compound literal.
Getting the compound literal right requires some sophistication, too. The biggest problem is that we would need a way to statically transform each list item to ensure it’s of the right size to fit into the underlying list. It may also be necessary to silence type errors. While you can store a lot of things in an array declared to hold objects of type void *, there are plenty of things that should fit that aren’t as simple as an assignment. Small integer types won’t convert; only ones that are the same size as void * will. For the smaller ones, you can first convert them to the equivalent 64-bit representation.
But while floats and doubles should fit into a 64-bit void *, to get them there, you have to jump through hoops involving a union.
Handling such special cases, and doing it all in a way that you can make transparent to the user, is not easy, but it is possible (in fact, we do it for you in the code accompanying this article).
Still, automatic stack allocation via macro ensures that the allocation will be automatically cleaned up when the calling context exits, just like a va_list would be. Removing sources of memory management errors makes this a great default, particularly if you’re doing things to automate away other sources of array errors, like using a good macro to automatically, and safely, count the number of arguments, in addition to having the accessor API explicitly check for all bounds errors.
On the down side, stack allocation itself cannot be moved into a function; the call has to either be explicit or in a macro, because taking it out of a macro would mean the value we return would already be conceptually ‘freed’ by the time the call exits.
While this does complicate matters, it is still reasonable to build. Yet, by default, it lets us transparently allocate variable argument lists when we need them, without having to manage lifetimes when we shouldn’t have to think about it. And if the automatic lifetime doesn’t suit our needs, we are in no way tethered to it!
For example, if you are likely to pass the same things over and over, you can use static memory, or something long-lived that’s heap allocated through whatever dynamic allocation API you already use.
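Both alternatives can be sketched in a few lines. As before, sketch_vargs_t is a hypothetical stand-in type with an assumed layout, and sketch_vargs_dup() is an illustrative helper, not part of any real API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list type: a count plus a pointer to 64-bit items. */
typedef struct {
    size_t          count;
    const uint64_t *items;
} sketch_vargs_t;

/* Static storage: built once and valid for the whole run of the program. */
static const uint64_t       default_items[] = { 80, 443 };
static const sketch_vargs_t default_ports   = {
    .count = sizeof default_items / sizeof default_items[0],
    .items = default_items,
};

/*
 * Heap copy: header and items share one allocation, so a single free()
 * releases both. Returns NULL if the allocation fails. The sketch does
 * not guard against count * 8 overflowing size_t.
 */
static sketch_vargs_t *sketch_vargs_dup(const sketch_vargs_t *src)
{
    size_t          bytes = src->count * sizeof(uint64_t);
    sketch_vargs_t *copy  = malloc(sizeof *copy + bytes);

    if (copy == NULL) {
        return NULL;
    }
    uint64_t *items = (uint64_t *)(copy + 1);
    if (bytes > 0) {
        memcpy(items, src->items, bytes);
    }
    copy->count = src->count;
    copy->items = items;
    return copy;
}
```

A function that accepts a const sketch_vargs_t * cannot tell which of the three storage options the caller picked, which is the point: lifetime becomes the caller’s choice rather than a property of the mechanism.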
To make things easy all around, especially when the user wants to skip all variable arguments, I’d still suggest wrapping every varargs function with a macro, at the very least for the typical case where syntactically at the call site, the user should feel like they’re calling a true variable argument function. For example, we might declare our my_join() function as such:
At that point, most invocations would look pretty simple, for example:
Plus, if you already have a reference to a h4x0r_vargs_t object, you can just pass it on directly.
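Putting the declaration, the wrapper macro, and the call site together, a minimal sketch might look like the following. It is not the companion code: sketch_str_vargs_t is a hypothetical stand-in for h4x0r_vargs_t specialized to strings, and the my_join() signature (output buffer, capacity, separator, then the items) is my assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical list type holding string items. */
typedef struct {
    size_t             count;
    const char *const *items;
} sketch_str_vargs_t;

/*
 * The real function takes every "variable" argument through one pointer.
 * Returns the number of characters written, not counting the terminator.
 */
static size_t my_join_impl(char *out, size_t cap, const char *sep,
                           const sketch_str_vargs_t *args)
{
    size_t used = 0;

    if (out == NULL || cap == 0) {
        return 0;
    }
    out[0] = '\0';
    for (size_t i = 0; args != NULL && i < args->count; i++) {
        int n = snprintf(out + used, cap - used, "%s%s",
                         i > 0 ? sep : "", args->items[i]);
        if (n < 0) {
            out[used] = '\0';   /* encoding error: keep what we had */
            break;
        }
        if ((size_t)n >= cap - used) {
            return cap - 1;     /* truncated; snprintf already terminated out */
        }
        used += (size_t)n;
    }
    return used;
}

/*
 * Call-site wrapper: looks variadic, but bundles the items into a list
 * on the caller's stack. The typed array also makes the compiler reject
 * non-string items. Needs at least one item before C23.
 */
#define my_join(out, cap, sep, ...)                                          \
    my_join_impl((out), (cap), (sep),                                        \
        &(sketch_str_vargs_t){                                               \
            .count = sizeof((const char *[]){__VA_ARGS__}) / sizeof(char *), \
            .items = (const char *[]){__VA_ARGS__},                          \
        })
```

A typical call is then my_join(buf, sizeof buf, ", ", "a", "b", "c"), while code that already holds a list skips the macro and hands the pointer straight to my_join_impl().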
Or, skip all the hard work
I’ve built an open source reference implementation of a variadic function API for C. It is similar to the above, but not quite identical, providing more flexibility. For instance, it provides a simple way to do full compile-time checking at vargs call sites, in cases where all variadic parameters are expected to be of a known type.
This all may feel like a big lift if you’ve got a legacy code base. Though, as one data point, I migrated a code base with over 100,000 lines of code, which makes extensive use of variable arguments, to this approach in about a workday.
The advantages are significant.
Here, we’ve mostly circumvented C’s built-in variable argument capabilities. But we’ve done it in a way that allows for a simple API, while keeping the implementation strategy consistent across all platforms. We’re bundling up everything the user may think is a variable argument via macro, and passing it all via a pointer to a single object of type h4x0r_vargs_t. That one pointer? Sure, it gets passed however the system wants to pass any other pointer. But that keeps it super simple at the ABI level. And from there, we have total control and flexibility.
Except for one thing…
Leveraging 3rd party varargs functions
It’s true, the C language itself provides no portable way to go from an array of any kind to a call to printf(), or similar. But there is a portable third-party library that can make it possible, libffi.
The library provides you with an API where you can programmatically call any loaded function you can reference, while dynamically constructing the arguments to the call. Its maintainers take on the burden of implementing all the semantics of every ABI to give the rest of us a fairly simple, single API.
Instead of reimplementing complex functionality like C’s format strings yourself, this allows you to simply write a little wrapper for each C function you might want to use.
The wrappers are pretty formulaic and only need to be written once. Without providing a full tutorial for the library, I’ll give you an example and hit some highlights. But, the companion code pre-wraps all the variadic calls from <stdio.h> for you, and does not require the libffi header (though it does require the library be present at link time).
Here’s an example wrapper for sprintf() that allows you to pass a pointer to a h4x0r_vargs_t object:
The abbreviation CIF in their API just stands for “Call InterFace”. ffi_prep_cif_var loads everything the library needs to perform the call, including the number of fixed and total arguments (some ABIs do treat varargs specially), the return type, and so on.
All the previous function does is construct arrays to tell the library about the call we’re trying to invoke, and the arguments we want to invoke it with, and then invokes a function to perform the actual call.
The main oddity is that all the arrays we populate for it take pointers to the values we want, never copies of the actual values.
Importantly, though, in this example, I did cheese one thing. For every variadic argument we want to pass, we’re assuming:
- The underlying ABI is going to go ahead and automatically convert everything to 64 bits.
- We’re not going to pass parameters larger than 128 bits.
As such, we just list every variable argument as being a “pointer”, which is not strictly true. But libffi is just using that data to figure out how to match your input to the ABI, so everything will work smoothly.
Also note that, if you will call these functions often, you can do some caching. Each thread could hang on to a CIF object per call; the only thing that needs to change in between invocations is the number of total arguments. Crucially, we do NOT pass the function itself, or the actual arguments when prepping the CIF object.
Instead, we pass those, along with a pointer to the CIF object, and a pointer to where we want the result to live, all to ffi_call().
Where C2Y should go
C is approaching 60 years old; even since the first standard (C89), over 35 years ago, change has been conservative. That’s all for a good reason, given how it’s at the core of so much of the software that runs the world.
But conservative change in C does happen, when it’s clearly valuable. It’s just that it’s always done with careful consideration given not just to the benefits, but also to “what might break”.
Personally, I think the bare minimum changes would be:
- The existing <stdarg.h> API should get an additional call, va_count() which would return the number of remaining args in a va_list. That can be accommodated in the va_list opaque type without having to change anything about the ABI level interaction, and is a big win for anyone who’s going to keep using the traditional API.
- A new predefined macro, __VA_COUNT__, which returns the number of variable arguments passed to a variadic macro, because it’s really painful to build due to the way C handles recursion in the preprocessor (actively doing its best to thwart it). Again, I’ll cover this in a follow-up article, which I’ll publish next week.
The addition of va_count() would force compilers and libraries to explicitly track the number of parameters to comply with the standard, which is a good thing.
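To show why a built-in count would help, here is the classic argument-shifting trick in its simplest form. This is a deliberately limited sketch with hypothetical names (SKETCH_VA_COUNT, SKETCH_PICK_9TH), not a robust implementation: it handles one to eight arguments, reports 1 for an empty list, and silently gives a wrong answer beyond eight.

```c
#include <assert.h>
#include <stddef.h>

/* Returns its ninth argument; everything before it is padding. */
#define SKETCH_PICK_9TH(a1, a2, a3, a4, a5, a6, a7, a8, n, ...) n

/*
 * The real arguments push the descending number list to the right, so
 * the value landing in the ninth position equals the argument count.
 */
#define SKETCH_VA_COUNT(...) \
    SKETCH_PICK_9TH(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

/* The count is an integer constant, so it can be checked at compile time. */
static_assert(SKETCH_VA_COUNT('a') == 1, "one argument");
static_assert(SKETCH_VA_COUNT('a', 'b', 'c') == 3, "three arguments");

/* Example consumer: a counted function hidden behind a variadic macro. */
static int sketch_sum(size_t count, const int *values)
{
    int total = 0;

    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

#define SUM(...) \
    sketch_sum(SKETCH_VA_COUNT(__VA_ARGS__), (const int[]){__VA_ARGS__})
```

Raising the limit means lengthening both macros by hand, and handling the empty list correctly takes considerably more machinery, which is the kind of effort a predefined __VA_COUNT__ would make unnecessary.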
Better still would be adding a new varargs implementation that is explicit about the representation and calling convention, not just the API. Being explicit here is better; the reason va_list is opaque is that there was no practical alternative in the face of multiple prominent implementations with different approaches.
A second interface can enable full language support for typing and composability, along with the ability to mutate lists when valuable. The type and API I sketched out above could be named using the stdc_ prefix with no problem. The need to use a macro at call time should go away completely.
That raises the question of what to do with the declaration syntax, though. I would recommend:
- A bare ellipsis ... would implicitly use va_list (though this behavior could perhaps be configurable via a preprocessor define). Such calls would keep today’s semantics and remain fully ABI compatible.
- If you add a type in front of the ellipsis, it would select this approach, with static type checking for all variadic arguments at the call site.
- Additionally, you could specify _Varg (or any other sane name) to enable the new approach, without static checking (and optionally elide the ellipsis). Instead, any parameters would automatically get promoted to 64 bits, and larger parameters would be passed as a series of 64-bit parameters.
In terms of mapping to the underlying ABI, the new structure will always be passed as a single pointer, following the ABI’s calling convention for fixed arguments.
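The rule that larger parameters become a series of 64-bit items can be illustrated with today’s C. This is only one possible reading of the idea, sketched with hypothetical names (wide_param_t, SLOTS_FOR, pack_slots, unpack_slots); a real specification would have to pin down padding and byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parameter that is wider than one 64-bit slot. */
typedef struct {
    uint64_t id;
    uint32_t flags;
} wide_param_t;

/* Number of 64-bit slots needed for a type, rounded up. */
#define SLOTS_FOR(type) \
    ((sizeof(type) + sizeof(uint64_t) - 1) / sizeof(uint64_t))

/*
 * Copy an object into consecutive slots, zeroing any tail bytes in the
 * last slot. Returns the number of slots used. The caller must provide
 * at least that many slots.
 */
static inline size_t pack_slots(uint64_t *slots, const void *value, size_t size)
{
    size_t n = (size + sizeof(uint64_t) - 1) / sizeof(uint64_t);

    memset(slots, 0, n * sizeof(uint64_t));
    memcpy(slots, value, size);
    return n;
}

/* Reverse of pack_slots(): rebuild the object from its slots. */
static inline void unpack_slots(void *value, const uint64_t *slots, size_t size)
{
    memcpy(value, slots, size);
}
```

Using memcpy() in both directions keeps the round trip well defined regardless of the object’s type, at the cost of the receiver needing to know the size of what it is unpacking.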
By default, these structures would automatically be placed on the stack, but it should be straightforward to provide a pointer to an object anywhere in memory to use for the context. This could perhaps be done via a macro like varg_call(object, join(arg, a, b, c)). Or, perhaps a call-site annotation.
Similarly, there should be a built-in mechanism to pass an array and length via copying in the contents. Something easier and clearer than: varg_array_call(arr, len, join(arg, a, b, c))
Even for those not using the new API, it would be a huge benefit to have conforming stdlib implementations provide a portable way to get an array into a va_list.
Similarly, there should be a simple API-based approach for calling a <stdarg.h> style variadic function, but passing it an array, without having to use a more general-purpose third-party library like libffi.
The proposed alternate strategy could be considered breaking the ABI for variable arguments. But the ABI is about binary-level compatibility across compilation units. The proposal would be to follow the ABI, but to have the high-level language feature in some sense feel like varargs for all practical purposes, even though at the ABI level, we would be passing only a single pointer to represent all our state.
That’s not even remotely problematic, as long as the approach is well specified and consistent. If a Rust program wants to call into a varargs function using this approach, they absolutely can, and when they see the type signature for the function, the work they have to do to pass variable arguments should be absolutely clear from the fact that the last argument is a pointer to our variable argument structure.
Yes, if you only have an ABI level view, and see the last argument, you have no idea if it’s intended to be some sort of proxy function that’s passed a full object (like a va_list) or a function where the system is expected to transparently bundle things that way.
But, that shouldn’t matter at the ABI level. Those two things should be indistinguishable. This is only a challenge for the C-level semantics, because language implementers will have to make different implementation choices based on the semantics.
Particularly, if a function is declared with an ellipsis, what should happen when someone passes in a raw varargs object?
If I wasn’t explicit enough on how I think this should be handled, the type passed should not be special-cased. If you want to reuse varargs across functions, either go the old-school route of explicitly declaring one to only take an object, OR use one of the explicit API calls to denote what you want to do.
It’s not a problem… as long as the specification is concrete enough.
I’m happy to write up a more detailed spec if the committee comes calling…
I will be drafting a proposal; ideally, the committee’s knowledge and feedback will lead to something even better in a future version of the standard.
Reference code
Acknowledgements
My gratitude to Robert Seacord and Ivan O’Day for excellent feedback and discussion on this post.