Complex Iterators Are Slow

3 months ago 2

Up: Programming notes

Thursday, 31 July 2025

Part of the "Faster JavaScript" series

Timi, my pure JavaScript B-Tree, achieves best in class iteration speed in part because I replaced Iterators with callbacks.

They might be convenient, but the design of JavaScript Iterators is inherently slow for complex iteration as it prevents your compiler from inlining code.

Inlining is when the call site of a function is replaced with its body to avoid the overhead of a function call. So, this:

function add(a, b) { return a + b; } for (let i = 0; i < 10000; i++) { add(1, 2); }

Becomes this:

for (let i = 0; i < 10000; i++) { 1 + 2; }

Inlining can have a dramatic effect on performance. So it's worth paying attention to. When performance tuning, you can use the --trace-turbo-inlining flag to ask TurboFan (one of V8's optimising compilers) to report when it inlines a function.

$ node --trace-turbo-inlining add.js Considering 0x555db021c350 {0x01e444e159b1 <SharedFunctionInfo add>} for inlining with 0x555db021c6a8 {0x01e444e15df1 <FeedbackVector[1]>} Inlining small function(s) at call site #51:JSCall Inlining 0x555db021c350 {0x01e444e159b1 <SharedFunctionInfo add>} into 0x555db021c170 {0x01e444e15949 <SharedFunctionInfo>}

TurboFan and other optimising compilers can be encouraged to inline calls to your function by keeping it small and simple (and calling it many times). But in this post, I want to consider when inlining fails.

I'm going to deliberately write a function that won't be inlined:

I've padded the add() function from earlier with useless code to prevent TurboFan inlining it. I wasn't scientific about the process, I just typed until it stopped inlining.

$ node --trace-turbo-inlining add2.js Cannot consider 0x559cdbb7b350 {0x2918c14159b1 <SharedFunctionInfo add>} for inlining (reason: 5) Cannot consider 0x559cdbb7b350 {0x2918c14159b1 <SharedFunctionInfo add>} for inlining (reason: 5)

Of course, normally, inlining is prevented not by pointless padding but by necessary work. Iteration in Timi, for example, can involve complicated walks up and down the tree and the code is - necessarily - too complex to be inlined.

But why is this a problem for Iterators? Consider the following (I've omitted some imaginary complicated code for clarity):

class MyIterator { i = 0; next() { // ... Complicated code ... return { value: this.i++, done: this.i >= 1000, } } [Symbol.iterator]() { return this; } }

Which can be used with a for...of loop:

let total = 0; const iter = new MyIterator(); for (const value of iter) { total += value; }

It's a nice API but it hides a function call that I will expand. In reality, the loop executes like this:

let total = 0; const iter = new MyIterator(); while (true) { const {value, done} = iter.next(); // not inlined due to complex code! if (done) break; total += value; }

It makes a slow non-inlined call to next() every time around the loop. This is unavoidable with an Iterator. But it can be avoided with callbacks.

Now, the same iteration converted to callbacks:

function myIterator(callback) { for (let i = 0; i < 1000; i++) { // ... Complicated code ... callback(i); // Possibly inlined } }

Here, the loop is moved into the iterator and is therefore already inline with the complicated code. If the callback is simple, it too can be inlined leaving no function calls inside the loop.

Here's how it looks in use:

let total = 0; myIterator(value => { total += value; });

The effect of this optimisation is evident in benchmarks. Below, I've used the padding code I typed into add() as my 'complicated code' block and benchmarked each style using Deno.bench():

$ deno bench bench.js CPU | 11th Gen Intel(R) Core(TM) i7-1160G7 @ 1.20GHz Runtime | Deno 2.2.11 (x86_64-unknown-linux-gnu) file:///home/caolan/Desktop/bench.js benchmark time/iter (avg) iter/s (min … max) p75 p99 p995 ------------- ----------------------------- --------------------- -------------------------- runCallback 1.2 µs 846,100 ( 1.2 µs … 1.3 µs) 1.2 µs 1.3 µs 1.3 µs runIterator 4.8 µs 210,000 ( 4.7 µs … 5.0 µs) 4.8 µs 5.0 µs 5.0 µs

The Iterator is 4 times slower than the callback API. If I remove the padding code, the Iterator's next() method can be inlined and the performance gap narrows significantly:

$ deno bench bench.js CPU | 11th Gen Intel(R) Core(TM) i7-1160G7 @ 1.20GHz Runtime | Deno 2.2.11 (x86_64-unknown-linux-gnu) file:///home/caolan/Desktop/bench.js benchmark time/iter (avg) iter/s (min … max) p75 p99 p995 ------------- ----------------------------- --------------------- -------------------------- runCallback 1.2 µs 866,000 ( 1.1 µs … 1.2 µs) 1.2 µs 1.2 µs 1.2 µs runIterator 1.8 µs 571,100 ( 1.7 µs … 1.8 µs) 1.8 µs 1.8 µs 1.8 µs

So very simple iterators are often performant enough. Just be careful when your iteration is more complex.

The inversion of control provided by Iterators, generators, and Promises is powerful. But beware any of these in a hot code path. Manually converting these patterns to callbacks might uncover ways to improve performance.

—Caolan

Read Entire Article

Complex Iterators Are Slow

Related

Go60 – MoErgo

Sea Urchins Are Basically 'All Brain'

A.I. Cheating Rattles Top Universities in South Korea