Delimited Continuations in Lone Lisp

Lone now supports delimited continuations!

It took me a long time but it's finally here: lone now supports one of the most powerful control mechanisms there is.

(import (lone print lambda control transfer)
        (math +))

(print (+ 1 (control (+ 98 (transfer 41))
                     (lambda (value continuation)
                       (continuation 1)))))

Implementing this feature will pave the way to many others, such as exception handling and generators. However, before moving on with development, I thought it would be worthwhile to write down the story of this feature's implementation. I want to write the article I wish I had read before I began. If nothing else, this will serve as documentation for the future version of myself whose brain has deleted all this context.

Iteration

Lone was growing in complexity. I already had all these data types, all of these collections. I was having quite a lot of fun implementing these things. I was learning so much. Hash tables? Powerful stuff.

However, there was something I was secretly avoiding: iteration.

Well, that's not entirely true. I didn't completely avoid the issue. Inspired by Ruby, I added some each primitives to the intrinsic modules.

(import (lone lambda print)
        (vector each))

(each [10 20 30]
      (lambda (value)
        (print value)))

The way the each function works internally is it iterates over the contents of the vector and calls the provided function with each element as its sole argument.

LONE_LISP_PRIMITIVE(vector_each)
{
	struct lone_lisp_value vector, f, entry;
	size_t i;

	LONE_LISP_VECTOR_FOR_EACH(entry, vector, i) {
		arguments = lone_lisp_list_build(lone, 1, &entry);
		lone_lisp_apply(lone, module, environment, f, arguments);
	}

	return lone_lisp_nil();
}

I cheated a little: I left the actual iteration up to the C compiler. That FOR_EACH macro simply evaluates to a good old for loop. The C code iterates on behalf of lone, applying the current element to the provided function.

So what is the issue? I mean, this works. This actually works just fine!

The problem is this made one of lone's limitations painfully obvious: lone lacked the ability to control the flow of the program. The only thing it knew how to do was call functions. That's why I had to implement iteration in terms of function calls.

I couldn't even begin to imagine how to implement something simple like a while primitive, to say nothing of magical stuff like generators. Clearly this was going to take a lot more effort than it initially seemed.

So I started reading all I could about iteration. The ergonomics of it. The design of the interfaces. I wanted to do The Right Thing right off the bat. I didn't really know what to look for, I just knew Ruby's iteration was nice and so lone's should be equally nice.

The very first result I found while searching was Bob Nystrom's article. I wasn't even surprised. I mean of course he has written about iteration. Two articles, even. First he taught me how to collect garbage, now he's going to teach me how to iterate properly.

These beautifully written articles demonstrate exactly why Ruby is nice. Ruby lets programmers write nice code that passes values to blocks, just like my each function. Whenever that fails to compose, concurrency comes to the rescue: Ruby somehow suspends the code in the middle of the iteration and yields control back to the caller, letting them resume whenever they want. This way, multiple iterative processes can be composed, interleaved or even interrupted. Good.

My sense of wonder quickly gave way to horror once I realized what was necessary to have such goodness. It turned out lone needed the one thing it did not have at the time: control over the call stack. Lone was a recursive tree-walking interpreter. The call stack, too, was managed by the C compiler.

I was going to have to bite the bullet. I was going to have to convert lone's recursive evaluator into a proper machine with registers and a stack.

Reifying the stack

I considered my options. I explored the code bases of popular programming language virtual machines. They all had bytecode virtual machines. How did they implement, say, generators? Copy the code, stack and instruction pointer into a callable heap allocated object. Makes sense, it's just that lone doesn't have an instruction pointer. I didn't want to transform the lisp code into bytecode. Is it really lisp if I get rid of the lists? I didn't think so.

How do I do this without transforming the code? I decided to ask around in programming language design and implementation communities. I asked this question on Stack Exchange and also brought it up on a Discord server. I was told about continuation passing style, yet another code transformation which I wanted to avoid... I really like those lists!

One particular reference kept popping up though: Structure and Interpretation of Computer Programs. SICP, for short. THE book of the Scheme programming language. One of the most classic books in the field. Science? Nah, we're more like wizards casting spells, computers are merely the runes upon which we inscribe our programs. As awesome as that is, the book is not an easy read, especially for someone like me who doesn't have a background in mathematics or engineering. I've tried to read this book a few times by now but I never made it through the entire thing. So imagine my embarrassment when I realized it contained the exact answer to my question all along!

Chapter 5.4, one of the very last chapters, describes the explicit-control evaluator, a register and stack machine which evaluates lisp expressions without converting them into something else first.

The explicit-control evaluator

So I read the chapter a few times to make sure I had gotten it down. Once I was somewhat confident in my understanding of the machine it described, I went ahead and translated the entire thing to C, modifying it as I went along.

Lone's evaluator did not have many special cases. Things like if are traditionally implemented as special cases in the evaluator, but since lone has FEXPRs I was able to factor them out into the intrinsic lone module. In lone, programmers import if as though it were some random function instead of a core part of the language.

The result was a machine that implements what I have come to believe to be the true essence of lisp: self-evaluating values, and function application. A lisp machine, if you will.

It starts with an arbitrary lisp expression and attempts to reduce it to a single value. If the expression is something like a number then it cannot be reduced any further. The machine just returns it. Easy.

expression_evaluation:
	switch (lone_lisp_type_of(machine->expression)) {
	case LONE_LISP_TYPE_NIL:
	case LONE_LISP_TYPE_FALSE:
	case LONE_LISP_TYPE_TRUE:
	case LONE_LISP_TYPE_INTEGER:
		machine->value = machine->expression;

If it's a symbol, the machine looks up the value of the symbol in the current environment instead. Also easy.

	case LONE_LISP_TYPE_SYMBOL:
		machine->value = lone_lisp_table_get(lone,
				machine->environment, machine->expression);

Only when the machine runs into a list does it really start processing.

Lists represent function application in the form (f x y z ...). The first thing that needs to happen is evaluation of the function itself. It's probably a variable pointing to the actual function value. It could also be a lambda expression. Whatever it is, it must be evaluated. So the machine loops back into the expression evaluation logic, this time with f as the expression.

	case LONE_LISP_TYPE_LIST:
		machine->expression = lone_lisp_list_first(machine->expression);
		lone_lisp_machine_push_step(lone, machine,
				LONE_LISP_MACHINE_STEP_EVALUATED_OPERATOR);
		goto expression_evaluation;

Once this is done, it's time to figure out what to do with the arguments. This depends on the function.

if (should_evaluate_operands(machine->applicable, machine->unevaluated)) {
	machine->step = LONE_LISP_MACHINE_STEP_OPERAND_EVALUATION;
} else {
	machine->list = machine->unevaluated;
	machine->step = LONE_LISP_MACHINE_STEP_APPLICATION;
}

Normal functions evaluate all arguments. In these cases the machine loops back and forth, evaluating each argument and accumulating the results in a list. It's this list that gets passed to the function in the end. FEXPRs just skip this step, the arguments are passed to the function unevaluated.

case LONE_LISP_MACHINE_STEP_OPERAND_EVALUATION:
	machine->expression = lone_lisp_list_first(machine->unevaluated);

	if (lone_lisp_list_has_rest(machine->unevaluated)) {
		lone_lisp_machine_push_step(lone, machine,
				LONE_LISP_MACHINE_STEP_OPERAND_ACCUMULATION);
	} else {
		lone_lisp_machine_push_step(lone, machine,
				LONE_LISP_MACHINE_STEP_LAST_OPERAND_ACCUMULATION);
	}

	goto expression_evaluation;

case LONE_LISP_MACHINE_STEP_OPERAND_ACCUMULATION:
	lone_lisp_list_append(lone, &machine->list, &head, machine->value);
	machine->step = LONE_LISP_MACHINE_STEP_OPERAND_EVALUATION;
	break;

case LONE_LISP_MACHINE_STEP_LAST_OPERAND_ACCUMULATION:
	machine->applicable = lone_lisp_machine_pop_value(lone, machine);
	lone_lisp_list_append(lone, &machine->list, &head, machine->value);
	machine->step = LONE_LISP_MACHINE_STEP_APPLICATION;

The next step is to actually apply the arguments to the function.

When it's a lisp function, a new environment is created where the variables in the function's list of arguments are bound to their values, thereby allowing the function to reference them. This environment also inherits from the function's closure, allowing it to reference variables that were live when it was defined. Pretty standard stuff. All that's left to do is evaluate the function's body.

switch (lone_lisp_heap_value_of(machine->applicable)->type) {
case LONE_LISP_TYPE_FUNCTION:
	machine->environment = bind_arguments(lone,
			machine->environment,
			machine->applicable,
			machine->list);

	machine->unevaluated = lone_lisp_heap_value_of(machine->applicable)->as.function.code;
	machine->step = LONE_LISP_MACHINE_STEP_SEQUENCE_EVALUATION;

Primitives are just C functions and some metadata. The machine just calls them. There is no way to automatically assign the lisp values to C variables, so the arguments list gets passed whole to the primitive. It gets to unpack the values however it wants.

It was at this point that I realized an interesting situation involving these C functions was developing: the primitives are going to want to call back into lisp. For example, if needs the machine to evaluate the condition and one of two expressions.

This doesn't sound so bad at first but it in fact spells doom for my ultimate goal of controlling the flow of the program:

The second (and deeper) implication is that if any intervening code does recurse through a foreign function, the resulting partial continuation cannot be reinstated. Andy Wingo, wingolog, 26 February 2010 8:39 PM

How can I possibly capture and manipulate the lisp stack when it's in fact interleaving with the C stack? I can't.

So after going through all this trouble to convert the recursive evaluator into a machine just to expose the lisp stack so that I could manipulate it, I end up in this sorry situation where lisp calls C which calls lisp again.

Clearly, the primitives cannot be allowed to recurse back into the machine... But how could this work? Many of them need to do this just to work at all!

I entertained the idea of just saving the entire C stack instead. I quickly gave up on this approach but not before I asked a question about it on Stack Exchange which produced a very interesting answer. So it was possible... Probably not wise but still. I always find it reassuring when I discover I'm not the first person who tried to do something that normal people clearly consider to be completely insane.

Integrating the primitives into the machine

Enough. No more of this C stack business. The primitives cannot be allowed to recurse back into lisp. That's the end of it. There's gotta be a way to solve this even with this constraint.

Inspiration came when I remembered this Stack Exchange thread I read while researching generators. Turns out generators are just perfectly normal functions that got mangled into state machines by the language.

struct fibState { a, b, position }

int fib(fibState state) {
	switch (fibState.postion) {
	case 0:
		fibState.a, fibState.b = 1,2
		while (a<100) {
			fibState.b, fibState.a = fibState.a, fibState.a+fibState.b
			fibState.position = 1;
			return fibState.a;
	case 1:
		}
		fibState.position = 2;
		return fibState.a-1
	case 2:
		fibState.position = -1;
	}
}

mousetail, Programming Language Design and Implementation Stack Exchange, May 20, 2023 at 14:06

It's got an initial state and it transitions to new states as it progresses through its code, returning multiple values along the way. Could even be made to loop depending on how it's set up. Oddly similar to the lisp machine, now that I think about it. Could these two concepts work together?

The answer turned out to be a resounding yes! I rewrote all of lone's primitives to be state machines instead, just like the example in the answer to the question. The aforementioned each primitive, for example, which was once a relatively simple function, turned into this monstrosity:

LONE_LISP_PRIMITIVE(vector_each)
{
	struct lone_lisp_value arguments, vector, function, entry, expression;
	lone_lisp_integer i;

	switch (step) {
	case 0:
		i = 0;

	iteration:
		entry = lone_lisp_vector_get_value_at(vector, i);
		expression = lone_lisp_list_build(lone, 2, &function, &entry);

		lone->machine.step = LONE_LISP_MACHINE_STEP_EXPRESSION_EVALUATION;
		lone->machine.expression = expression;
		return 1;

	case 1:
		++i;

		if (i < lone_lisp_vector_count(vector)) {
			goto iteration;
		} else {
			break;
		}
	}

	lone_lisp_machine_push_value(lone, machine, lone_lisp_nil());
	return 0;
}

The machine now passes to the primitive an integer that represents the current step in its program. The primitive in turn returns to the machine the next step in its program. Lisp arguments and return values are now passed on the lisp stack.

Other than the calling convention, pretty much nothing changed for simple primitives that do nothing special. The initial and final steps are both zero. They receive zero, do their thing and return zero. Done. They don't even need to deal with all this step stuff at all.

Special primitives, on the other hand, gain the ability to interface with the machine. Whenever the primitive wants the machine to do something, such as evaluate an expression, it rigs the machine accordingly and returns a non-zero integer. This tells the machine that the primitive should be resumed later at that exact point. The machine goes off and does whatever it was asked to do, then calls the primitive again at the designated step. This way, the lisp and C stacks never interleave. The C stack is simply not allowed to build up: the C function returns instead of calling back into lisp.

Inversion of control. Don't call the machine, the machine will call you. Let it know what you need and it will get back to you when it's done.

Or maybe it won't! Maybe the machine will ghost the primitive and keep it waiting until the end of time for a value that will never come. When the primitives were simply calling evaluator functions, the C compiler guaranteed those functions would always return their results to them. That's no longer the case. The lisp machine is in control now.

case LONE_LISP_TYPE_PRIMITIVE:
	lone_lisp_machine_push_value(lone, machine, machine->list);
	machine->primitive.step = 0;

resume_primitive:
	machine->primitive.step = lone_lisp_heap_value_of(machine->applicable)->as.primitive.function(
			lone, machine, machine->primitive.step);

	if (machine->primitive.step) {
		lone_lisp_machine_save_primitive_step(lone, machine);
		lone_lisp_machine_push_value(lone, machine, machine->applicable);
		lone_lisp_machine_push_value(lone, machine, machine->environment);
		lone_lisp_machine_push_step(lone, machine,
				LONE_LISP_MACHINE_STEP_RESUME_PRIMITIVE);
	} else {
		machine->value = lone_lisp_machine_pop_value(lone, machine);
	}

case LONE_LISP_MACHINE_STEP_RESUME_PRIMITIVE:
	machine->environment = lone_lisp_machine_pop_value(lone, machine);
	machine->applicable = lone_lisp_machine_pop_value(lone, machine);
	lone_lisp_machine_restore_primitive_step(lone, machine);
	goto resume_primitive;

This works and is surprisingly neat. Sure, all my primitives turned into weird state machines but it's not really a big deal. By now I've debugged this stuff so much I actually started liking it.

Manipulating the stack

What was the point of going through all this trouble again? Ah yes. I remember now. The stack. There it is. Reified. Just sitting there. Right in the middle of the structure. Literally one pointer away. Just waiting to be smashed and overflown all night long.

So what am I going to do with this thing? What can I do with it?

Can I return early from functions?

(import (lone set print lambda return))

(set f (lambda (x)
         (return 42)
         x))

(print (f 100))

To return, I need to find where on the stack the function starts. I need to put some kind of marker on the stack. A delimiter. It shall be pushed onto the stack before the function is called and popped off the stack when it has finished executing.

case LONE_LISP_TYPE_FUNCTION:
	lone_lisp_machine_push_function_delimiter(lone, machine);

case LONE_LISP_TYPE_PRIMITIVE:
	lone_lisp_machine_push_function_delimiter(lone, machine);

	if (machine->primitive.step) {
	} else {
		machine->value = lone_lisp_machine_pop_value(lone, machine);
		goto after_application;
	}

case LONE_LISP_MACHINE_STEP_AFTER_APPLICATION:
after_application:
	lone_lisp_machine_pop_function_delimiter(lone, machine);

Now return can just unwind the stack until it finds this marker. To unwind the stack, simply pop off frames until you reach the frame you were looking for. Once the delimiter is on top of the stack, the stack discipline matches that of primitives. The return value can simply be pushed onto the stack.

LONE_LISP_PRIMITIVE(lone_return)
{
	struct lone_lisp_machine_stack_frame frame;
	struct lone_lisp_value return_value;

	return_value = lone_lisp_machine_pop_value(lone, machine);
	lone_lisp_machine_pop_function_delimiter(lone, machine);

	while (LONE_LISP_MACHINE_STACK_FRAME_TYPE_FUNCTION_DELIMITER !=
			(frame = lone_lisp_machine_pop(lone, machine)).type);

	lone_lisp_machine_push(lone, machine, frame);
	lone_lisp_machine_push_value(lone, machine, return_value);

	return 0;
}

So this is the power of the call stack...

Delimited continuations

So completely drunk on the power of the dark side was I, that I decided to go from the humble return primitive to one of the most powerful control mechanisms there is: delimited continuations. They say it is the one control to rule them all, one control to implement them, one control to bring them all, and in the stack unwind them.

But first I had to wrap my head around plain simple continuations. What even are these things? I knew of their existence because of my admittedly shallow knowledge of the Scheme programming language but I sure as hell didn't understand them. Wikipedia has the following example:

(define (f return)
  (return 2)
  3)

(f (lambda (x) x))

(call-with-current-continuation f)

It's not immediately apparent what's going on here. Let's unpack it step by step.

f takes a function, an applicable value really. Nothing special happens when a normal function is passed: it gets called, returns, and the program continues on.

Passing f to call-with-current-continuation, which by the way is commonly abbreviated as call/cc, causes it to be called with the current continuation, as the name implies. This current continuation is a super special callable value that, when called, somehow causes the call/cc call itself to return the value passed to it.

This really flips my bits. I simply have no idea how to use this. It sorta looks like my return primitive from earlier, but the function's gotta be threaded through call/cc?

The community once again comes to my rescue. A kind soul on Discord pointed me towards a wonderful presentation about delimited continuations. The keynote was given by Alexis King, the same person who answered one of my Stack Exchange questions I mentioned earlier. It's an incredibly insightful presentation that's worth watching in its entirety. It really does demystify the topic.

call/cc is briefly mentioned in the talk and it's explained how totally backwards and mind bending it is. It's sorta like exceptions but backwards, as if the catch was next to the throw? I think? I'm not even sure.

Let's just forget about call/cc. Plenty of reasons not to insist on this thing anyway. Continuations should be delimited. I have learned at least this much.

The link between delimited continuations and exception handling is another key point. It's the mother of all insights, the one idea that brings this mystical continuation business to the unwashed masses: delimited continuations are just resumable exceptions.

It's as if Python could do this:

try:
    print(10 + throw("error"))
except error, continuation:
    continuation(10)

The throw breaks out of the try block and enters the except block, which is like a function with two arguments: the value thrown and the continuation at the moment it was thrown.

The exception handler code can just ignore the continuation and do nothing with it, which is what normally happens when exceptions are handled in pretty much every other language.

However, the callable continuation value allows another possibility: it lets the handler code plug a value back into the code being tried, as though the throw primitive had returned it! So in this example, after throw breaks out of try and passes control to except, the exception handler code goes back into the try block with a new value, allowing it to finish as if it hadn't thrown an exception in the first place!

OK, that's not entirely accurate: calling the continuation doesn't actually go back in there. By the time the exception handler is in control, the try block is no more. The stack has been unwound and the code has reached a completely different function. What actually happens when the continuation is called is it brings over the try block to the call site.

try:
except error, continuation:
    print(10 + 10)

It's as if the throw primitive captured all the pending computations at its call site and reified them into a callable value. When called, the value gets replaced with those exact computations but with the throw primitive replaced with some real value.

Let's go back to the lisp machine. It's got a stack onto which it pushes lots of data as it executes. The stack is not made up of just data, however. Machine steps are also pushed onto the stack. The stack is full of values which direct the machine to do things in a specific order. The stack is a form of code.

It's this code that forms the "current continuation". It's this code that's being captured.

Capturing the continuation

Another key insight in the keynote: continuations turn out to be just memcpying the stack back and forth.

The stack is just memory. We're going to need a buffer.

struct lone_lisp_continuation {
	size_t frame_count;
	struct lone_lisp_machine_stack_frame *frames;
};

We're going to need delimiters too. The return primitive makes use of an implicit delimiter managed by the lisp machine itself. This general control primitive will manage the delimiter all by itself.

LONE_LISP_PRIMITIVE(lone_control)
{
	struct lone_lisp_value arguments, body, handler;

	switch (step) {
	case 0:
		lone_lisp_machine_push_value(lone, machine, handler);
		lone_lisp_machine_push_continuation_delimiter(lone);

		machine->step = LONE_LISP_MACHINE_STEP_EXPRESSION_EVALUATION;
		machine->expression = body;
		return 1;

	case 1:
		lone_lisp_machine_pop_continuation_delimiter(lone);
		lone_lisp_machine_pop_value(lone, machine);
		lone_lisp_machine_push_value(lone, machine, machine->value);
		return 0;

	default:
		linux_exit(-1);
	}
}

This primitive pushes the handler function and a continuation delimiter onto the stack. Then it evaluates the body of code. Once that's done, the primitive discards the handler function and the continuation delimiter and simply returns the result. So by itself it does nothing special. It's a glorified begin primitive.

Enter the transfer primitive.

LONE_LISP_PRIMITIVE(lone_transfer)
{
	struct lone_lisp_machine_stack_frame *delimiter, *frames;
	struct lone_lisp_value arguments, value, continuation, handler;
	size_t frame_count;

	switch (step) {
	case 0:
		for (delimiter = lone->machine.stack.top - 1 - 2, frame_count = 0;
		     delimiter >= machine->stack.base &&
		     delimiter->type != LONE_LISP_MACHINE_STACK_FRAME_TYPE_CONTINUATION_DELIMITER;
		     --delimiter, ++frame_count);

		frames = lone_memory_array(lone->system, 0, frame_count, sizeof(*frames));
		lone_memory_move(delimiter + 1, frames, frame_count * sizeof(*frames));
		continuation = lone_lisp_continuation_create(lone, frame_count, frames);

		lone->machine.stack.top = delimiter;
		handler = lone_lisp_machine_pop_value(lone, machine);

		lone->machine.expression = lone_lisp_list_build(lone, 3,
				&handler, &value, &continuation);
		lone->machine.step = LONE_LISP_MACHINE_STEP_EXPRESSION_EVALUATION;
		return 1;

	case 1:
		lone_lisp_machine_push_value(lone, machine, lone->machine.value);
		return 0;

	default:
		linux_exit(-1);
	}
}

The first thing it does is find the continuation delimiter. Then it copies all the stack frames between it and the top of the stack into a heap allocated buffer. Then it creates a lisp value that just holds this buffer. That's the continuation.

Then it unwinds the stack past the continuation delimiter, effectively popping off the entire segment of the stack that was just copied into the heap. It knows that just below the delimiter is the handler function so it pops it off the stack. At this point the stack discipline matches the initial state of the control primitive. We've really gone back in there! From the lisp machine's perspective, we're inside control right now, and if we return a value it will be as though control had returned it.

The primitive now has the value, the continuation and the handler function. There is but one thing left to do: evaluate (handler value continuation). The return value of that expression becomes the result of the control primitive.

I could stop here and call this an exceptions mechanism. If I deleted the continuation capture code, it would be just like the exceptions in every other language!

Making continuations callable

I'm not gonna stop there though. I'm so close!

My continuation value is just a structure that holds an array of stack frames. This is a new value type which must be properly handled in various parts of the system, especially in the garbage collector. Fail to mark and sweep these things and memory leaks will be the least of your problems.

The lisp machine must also be taught how to handle these objects. In particular, it must be taught how to apply values to them.

Like functions, continuations evaluate to themselves.

case LONE_LISP_TYPE_CONTINUATION:
	machine->value = machine->expression;

Continuations don't fail the applicable type test.

case LONE_LISP_MACHINE_STEP_EVALUATED_OPERATOR:
	switch (lone_lisp_type_of(machine->applicable)) {
	case LONE_LISP_TYPE_CONTINUATION:
		break;
	}

	if (should_evaluate_operands(machine->applicable, machine->unevaluated))

Continuations always have their arguments evaluated.

static bool should_evaluate_operands()
{
	switch (lone_lisp_heap_value_of(applicable)->type) {
	case LONE_LISP_TYPE_CONTINUATION:
		return true;

And finally... When a value is applied to it, the machine spills the captured stack frames on top of the stack and sets the machine registers so that the argument flows into the computation.

case LONE_LISP_MACHINE_STEP_APPLICATION:
	switch (lone_lisp_heap_value_of(machine->applicable)->type) {
	case LONE_LISP_TYPE_CONTINUATION:
		if (lone_lisp_list_has_rest(machine->list)) {
			goto too_many_arguments;
		}

		lone_lisp_machine_push_frames(lone,
				lone_lisp_heap_value_of(machine->applicable)->as.continuation.frame_count,
				lone_lisp_heap_value_of(machine->applicable)->as.continuation.frames);

		lone_lisp_machine_restore_step(lone, machine);
		machine->value = lone_lisp_list_first(machine->list);

The continuation is carefully captured so as to ensure the machine's next step is on top of the stack. The machine simply restores that value and off it goes. When it's done, the result flows naturally into the caller.

And just like that, lone has gained native first class delimited continuations!
