Getting Started with Randomised Testing


I'm somewhat obsessed with randomised testing. AFAIK, I'm the first consultant to do Deterministic Simulation Testing (DST) for a client. But when clients approach me about implementing DST, we usually realise quickly that their existing codebase isn't ready for it: DST requires significant refactoring to isolate testable components from external dependencies.

A much easier starting point is Property-Based Testing, which is more amenable to smaller slices of deterministic code (which can be slim pickings in a legacy code base!). While smaller in scope than DST, it offers massive benefits over traditional example-based tests, and it is relatively easy to get started with.

Definitions

Let's first define a unit test as a test of internal logic that does not interact with, mock, or simulate anything external like a database or remote API. (For some of you this may overlap with your definition of an integration test; I ask you to bear with me.)

Most unit tests are example-based tests - they encode specific actual and expected values.

Property-based tests, or PBTs, generalise this. Instead of giving concrete values for your test, you describe a range of values. The test runner then uses a seeded RNG to sample from that range and checks whether any sampled value makes your assertions fail. Because the RNG is seeded, a failing run can be replayed.
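To make the "seeded RNG samples from a range" idea concrete, here is a minimal, dependency-free sketch. It is not how fast-check or any other library is implemented; `mulberry32`, `Gen`, `intBetween` and `check` are names made up for this illustration, and there is no shrinking.

```typescript
// Hypothetical mini runner for illustration only (no shrinking, no reporting).

// mulberry32-style PRNG: the same seed always yields the same sequence in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0
  return () => {
    state = (state + 0x6d2b79f5) >>> 0
    let t = state
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// A generator turns random numbers into a value of type T.
type Gen<T> = (rand: () => number) => T

// Integers in the inclusive range [min, max].
const intBetween = (min: number, max: number): Gen<number> =>
  rand => min + Math.floor(rand() * (max - min + 1))

type Outcome<T> = { ok: true } | { ok: false; seed: number; counterexample: T }

// Sample `runs` values and report the first one that violates the property.
function check<T>(
  gen: Gen<T>,
  property: (value: T) => boolean,
  seed: number,
  runs = 100
): Outcome<T> {
  const rand = mulberry32(seed)
  for (let i = 0; i < runs; i++) {
    const value = gen(rand)
    if (!property(value)) return { ok: false, seed, counterexample: value }
  }
  return { ok: true }
}

// True for every integer, so no counterexample turns up.
const doubling = check(intBetween(-1000, 1000), n => n + n === 2 * n, 42)

// Deliberately false claim ("abs(n) is always n"): any negative sample refutes it.
const broken = check(intBetween(-1000, 1000), n => Math.abs(n) === n, 42)

// Same seed, same samples: the failure is reproducible.
const replay = check(intBetween(-1000, 1000), n => Math.abs(n) === n, 42)
```

Returning the seed alongside the counterexample is the important design choice: it is what lets a failure seen once be replayed later.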

Both example-based and property-based tests are a form of unit testing.

A mathematical analogy

If you're mathematically inclined, you can think of example-based tests as simple identities that lack variables, such as:

    1 + 1 = 2
    2² = 4
    sin(0) = 0
    cos(0) = 1

Property-based tests are analogous to mathematical identities that utilise variables, which implicitly or explicitly belong to some set:

    0 + n = n
    1ⁿ = 1
    sin²(t) + cos²(t) = 1
    (x * y) * z = x * (y * z)  ∀ x,y,z ∈ S
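The same contrast can be sketched in code. This is only an illustration under stated assumptions: the set S is taken to be the integers from -20 to 20, chosen small enough to enumerate exhaustively, whereas a real PBT library would sample from a much larger set. The floating-point identity is checked against a tolerance rather than for exact equality.

```typescript
// Example-based: identities with no variables.
const examples: boolean[] = [
  1 + 1 === 2,
  2 ** 2 === 4,
  Math.sin(0) === 0,
  Math.cos(0) === 1,
]

// Property-style: identities quantified over a set S (here: integers -20..20).
const S: number[] = Array.from({ length: 41 }, (_, i) => i - 20)

// 0 + n = n
const additiveIdentity = S.every(n => 0 + n === n)

// 1^n = 1
const powerOfOne = S.every(n => 1 ** n === 1)

// sin²(t) + cos²(t) = 1, up to floating-point rounding
const pythagorean = S.every(
  t => Math.abs(Math.sin(t) ** 2 + Math.cos(t) ** 2 - 1) < 1e-12
)

// (x * y) * z = x * (y * z) for all x, y, z in S (exact for small integers)
const associative = S.every(x =>
  S.every(y => S.every(z => (x * y) * z === x * (y * z)))
)
```

Note that associativity is only exact here because the products of small integers are representable; over arbitrary doubles the same property would fail, which is exactly the kind of thing a PBT tends to surface.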

An Example

Let's look at a basic shopping cart in TypeScript.

    export type Item = {
      id: string
      name: string
      price: number
      quantity: number
    }

    export class ShoppingCart {
      // Items keyed by their id
      #items: Map<string, Item> = new Map()

      // Merge quantities if the id is already in the cart, otherwise store the item
      add(item: Item): void {
        const existing = this.#items.get(item.id)
        if (existing) {
          existing.quantity += item.quantity
        } else {
          this.#items.set(item.id, item)
        }
      }

      // Reduce the quantity; drop the item once nothing is left
      remove(id: string, quantity: number): void {
        const item = this.#items.get(id)
        if (!item) return
        item.quantity -= quantity
        if (item.quantity <= 0) {
          this.#items.delete(id)
        }
      }

      getItem(id: string): Item | undefined {
        return this.#items.get(id)
      }

      // Map.prototype.values() returns a MapIterator (TypeScript 5.6+)
      get items(): MapIterator<Item> {
        return this.#items.values()
      }

      // The two reductions below rely on iterator helpers (Iterator.prototype.reduce)
      get totalPrice(): number {
        return this.items.reduce(
          (price, item) => price + item.price * item.quantity,
          0
        )
      }

      get itemCount(): number {
        return this.items.reduce((total, item) => total + item.quantity, 0)
      }

      clear(): void {
        this.#items.clear()
      }

      get empty(): boolean {
        return this.#items.size === 0
      }
    }

And here are some example-based unit tests to verify the behaviour (imports elided for brevity):

    test("adding items to the cart", () => {
      const cart = new ShoppingCart()
      const item = { id: "1", name: "Apple", price: 1.5, quantity: 2 }
      cart.add(item)
      expect(cart.getItem("1")).toEqual(item)
    })

    test("calculating totals", () => {
      const cart = new ShoppingCart()
      cart.add({ id: "1", name: "Apple", price: 1.5, quantity: 2 })
      cart.add({ id: "2", name: "Banana", price: 0.75, quantity: 3 })
      expect(cart.totalPrice).toBe(5.25)
    })

    test("removing items from cart", () => {
      const cart = new ShoppingCart()
      cart.add({ id: "1", name: "Apple", price: 1.5, quantity: 3 })
      cart.remove("1", 1)
      expect(cart.getItem("1")?.quantity).toBe(2)
    })

    test("empty cart should behave like it's empty", () => {
      const cart = new ShoppingCart()
      expect(cart.empty).toBe(true)
      expect(cart.totalPrice).toBe(0)
      expect(cart.itemCount).toBe(0)
    })

Wow, static typing AND unit tests? This code is indestructible... or is it!?

Enter Property-Based Testing

One of the great things about PBTs is that they immediately demonstrate how broken almost any piece of code is. To wit:

    // Arbitrary that generates random Items; price and quantity can be any double
    const arbItem: fc.Arbitrary<Item> = fc.record({
      id: fc.string(),
      name: fc.string(),
      price: fc.double(),
      quantity: fc.double(),
    })

    test("total price should never be negative", () => {
      fc.assert(
        fc.property(fc.array(arbItem), items => {
          const cart = new ShoppingCart()
          for (const item of items) {
            cart.add(item)
          }
          expect(cart.totalPrice).toBeGreaterThanOrEqual(0)
        })
      )
    })

The test - which generates a random array of random Items - checks whether the following property holds: is the total price of the shopping cart always greater than or equal to 0?

It does not! Our PBT library (in this case fast-check) finds a simple example of an input that will trigger a failure:

    [
      {
        "id": "",
        "name": "",
        "price": -1.1890419245983166e-66,
        "quantity": 2.0775787447871207e-258
      }
    ]

Our shopping cart has no defense against items with negative prices.

Note how "weird" the example is. Machines are fantastic at coming up with inputs we would never think of ourselves. Most PBT libraries bias heavily towards weird values (tiny numbers, zero, empty strings etc.) of the kind that show up in bug reports.
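As a rough illustration of what "biasing towards weird values" can mean, here is a hypothetical generator that returns a known troublemaker some fraction of the time and an ordinary value otherwise. The names `EDGE_CASES` and `biasedNumber`, the 20% default and the value range are all assumptions made for this sketch; real libraries such as fast-check use their own, more sophisticated strategies.

```typescript
// Values that tend to show up in bug reports (illustrative list, not exhaustive).
const EDGE_CASES: number[] = [
  0,
  -0,
  1,
  -1,
  Number.MIN_VALUE,
  Number.MAX_VALUE,
  Number.POSITIVE_INFINITY,
  Number.NEGATIVE_INFINITY,
  Number.NaN,
]

// `rand` must return floats in [0, 1), e.g. a seeded PRNG or Math.random.
function biasedNumber(rand: () => number, edgeBias = 0.2): number {
  if (rand() < edgeBias) {
    // Pick one of the edge cases uniformly.
    return EDGE_CASES[Math.floor(rand() * EDGE_CASES.length)]
  }
  // Otherwise an "ordinary" value in [-1e6, 1e6).
  return (rand() - 0.5) * 2e6
}

// Usage with the built-in (unseeded) RNG:
const sample = biasedNumber(Math.random)
```

Taking `rand` as a parameter keeps the generator deterministic under a seeded RNG, which matters for replaying failures.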

Let's update the add method to reject negative prices. And being clever, let's also foresee that quantity must not be negative either:

    add(item: Item): void {
      if (0 > item.price) {
        throw new Error("Item price cannot be negative")
      }
      if (0 > item.quantity) {
        throw new Error("Item quantity cannot be negative")
      }
      const existing = this.#items.get(item.id)
      if (existing) {
        existing.quantity += item.quantity
      } else {
        this.#items.set(item.id, item)
      }
    }

And let's update our PBT to make sure a failed add does not modify the state of the cart:

    test("total price should never be negative", () => {
      fc.assert(
        fc.property(fc.array(arbItem), items => {
          const cart = new ShoppingCart()
          for (const item of items) {
            // Snapshot the observable state before each add
            const empty = cart.empty
            const totalPrice = cart.totalPrice
            const itemCount = cart.itemCount
            try {
              cart.add(item)
            } catch {
              // A rejected add must leave the cart untouched
              expect(cart.empty).toBe(empty)
              expect(cart.totalPrice).toBe(totalPrice)
              expect(cart.itemCount).toBe(itemCount)
            }
          }
          expect(cart.totalPrice).toBeGreaterThanOrEqual(0)
        })
      )
    })

Is all well? ROFL no. Here's the next value that breaks the code.

    [
      {
        "id": "",
        "name": "",
        "price": 0,
        "quantity": Number.POSITIVE_INFINITY
      }
    ]
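Why does this input slip through? A short sketch of the arithmetic, using only standard JavaScript number semantics: an infinite quantity is not negative, so both guards pass, but 0 multiplied by Infinity is NaN, and NaN fails every comparison. The variable names are just for this illustration.

```typescript
const item = { id: "", name: "", price: 0, quantity: Number.POSITIVE_INFINITY }

// Neither guard in add() fires: 0 is not negative, and neither is Infinity.
const passesGuards = !(0 > item.price) && !(0 > item.quantity) // true

// The line total is 0 * Infinity, which is NaN under IEEE 754.
const lineTotal = item.price * item.quantity
const totalIsNaN = Number.isNaN(lineTotal) // true

// NaN compares false against everything, so ">= 0" cannot hold.
const assertionHolds = lineTotal >= 0 // false

// A finiteness check would have caught it.
const quantityIsFinite = Number.isFinite(item.quantity) // false
```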

At this point, I would probably use a schema to validate any potential items, so let's replace our if statements with a superstruct schema, and add some more logic for quantity:

    import * as ss from "superstruct"

    const ItemSchema = ss.object({
      id: ss.string(),
      name: ss.string(),
      // Price may be zero but never negative
      price: ss.refine(ss.number(), "non-negative", n => n >= 0),
      // Quantity must be a whole number greater than zero
      quantity: ss.refine(ss.integer(), "positive", n => n > 0),
    })

    export type Item = ss.Infer<typeof ItemSchema>

And now add reads like this:

    add(item: Item): void {
      // Throws if the item does not match the schema
      ss.assert(item, ItemSchema, "Invalid Item")
      const existing = this.#items.get(item.id)
      if (existing) {
        existing.quantity += item.quantity
      } else {
        this.#items.set(item.id, item)
      }
    }

Finally, our property-based test is silent!
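For readers who don't use superstruct, the numeric rules the schema enforces can be approximated with a plain predicate. This is a dependency-free sketch of the schema's intent as I read it, not a drop-in replacement; `isValidItem` is a hypothetical helper, and it ignores the string checks on id and name.

```typescript
type Item = { id: string; name: string; price: number; quantity: number }

// Approximates the numeric constraints: price >= 0, quantity a positive integer.
function isValidItem(item: Item): boolean {
  const priceOk = !Number.isNaN(item.price) && item.price >= 0
  // Number.isInteger is false for NaN, Infinity and fractional values.
  const quantityOk = Number.isInteger(item.quantity) && item.quantity > 0
  return priceOk && quantityOk
}

// The two counterexamples found earlier are now rejected.
const tinyNegativePrice = isValidItem({
  id: "",
  name: "",
  price: -1.1890419245983166e-66,
  quantity: 2.0775787447871207e-258,
})
const infiniteQuantity = isValidItem({
  id: "",
  name: "",
  price: 0,
  quantity: Number.POSITIVE_INFINITY,
})

// An ordinary item still passes.
const apple = isValidItem({ id: "1", name: "Apple", price: 1.5, quantity: 2 })
```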

Findings

Let's consider what we've uncovered by using PBT:

  • Even a straightforward, 50-line TypeScript class is riddled with bugs.
  • TypeScript's static typing alone is wholly inadequate for writing safe software. TypeScript is admittedly poor in numeric types, but even so, most static type systems cannot express constraints like non-negative or positive numbers.
  • Example-based tests are simply not as powerful as property-based tests. We could have written more regular tests. We could have remembered to test for negatives, positives, non-safe integers... but would we have? How much longer - and thus less maintainable - would that test suite be?

Think also about how much more confident we are in our shopping cart implementation, and how any future production bug can now be generalised as a property and added as a PBT (keen-eyed readers may note we do not handle duplicate Item IDs).

My message - Embrace PBT

Confident developers iterate faster than developers riddled with doubt. PBTs bring us one step closer to verifying behaviour in terms of requirements. If you'd rather find issues before your users do, PBT is a fantastic start. Most languages have a PBT library, so give it a shot.
