(With apologies to the screenwriters of Forrest Gump)
I’m going to use this post to pull together some related threads from different sources I’ve been reading lately.
Rationalization as discarding information
The first thread is from The Control Revolution by the late American historian and sociologist James Beniger, published back in the 1980s. I discovered the book because it was referenced in Neil Postman's Technopoly.
Beniger references Max Weber's concept of rationalization, which I had never heard of before. I'm used to "rationalization" as a pejorative, meaning something like "convincing yourself that your emotionally preferred option is the most rational option", but that's not how Weber meant it. Here's Beniger, emphasis mine (from p. 15):
Although [rationalization] has a variety of meanings … most definitions are subsumed by one essential idea: control can be increased not only by increasing the capacity to process information but also by decreasing the amount of information to be processed.
…
In short, rationalization might be defined as the destruction or ignoring of information in order to facilitate its processing.
This idea of rationalization feels very close to James Scott’s idea of legibility, where organizations depend on simplified models of the system in order to manage it.
Decision making: humans versus statistical models
The second thread is from Benjamin Recht, a professor of computer science at UC Berkeley who does research in machine learning. Recht recently wrote a blog post called The Actuary's Final Word about how algorithms perform compared to human experts at tasks such as medical diagnosis. Back in the 1950s, the late American psychology professor Paul Meehl argued that the research literature showed statistical models outperforming human doctors at diagnosing medical conditions. Meehl's work even inspired the psychologist Daniel Kahneman, who famously studied heuristics and biases.
In his post, Recht asks, “what gives?” If we have known since the 1950s that statistical models do better than human experts, why do we still rely on human experts? Recht’s answer is that Meehl is cheating: he’s framing diagnostic problems as statistical ones.
Meehl’s argument is a trick. He builds a rigorous theory scaffolding to define a decision problem, but this deceptively makes the problem one where the actuarial tables will always be better. He first insists the decision problem be explicitly machine-legible. It must have a small number of precisely defined actions or outcomes. The actuarial method must be able to process the same data as the clinician. This narrows down the set of problems to those that are computable. We box people into working in the world of machines.
…
This trick fixes the game: if all that matters is statistical outcomes, then you’d better make decisions using statistical methods.
Once you frame a problem as statistical in nature, then a statistical solution will be the optimal one, by definition. But, Recht argues, it's not obvious that we should be using the average of the machine-legible outcomes to do our evaluation. As Recht puts it:
How we evaluate decisions determines which methods are best. That we should be trying to maximize the mean value of some clunky, quantized, performance indicator is not normatively determined. We don’t have to evaluate individual decisions by crude artificial averages. But if we do, the actuary will indeed, as Meehl dourly insists, have the final word.
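To make that concrete, here's a toy Python simulation (my own construction, not from Recht's post) of a decision problem framed exactly the way Meehl requires: every case is reduced to a single machine-legible score, the outcome is generated from that score alone, and rules are judged by the mean of a 0/1 correctness indicator. In a world set up like that, the threshold rule wins by construction.

```python
# Toy simulation (not from Recht's post): a decision problem framed the way
# Meehl requires. Every case is reduced to one machine-legible score, the
# outcome is generated from that score alone, and rules are judged by the
# mean of a 0/1 correctness indicator. All numbers are made up.
import numpy as np

rng = np.random.default_rng(0)

scores = rng.uniform(size=100_000)                 # P(positive outcome) per case
outcomes = rng.uniform(size=scores.size) < scores  # realized outcomes

def actuarial_rule(score):
    # The statistical rule: predict positive iff the score exceeds 0.5.
    # Under mean 0/1 loss this is the Bayes-optimal decision in this setup.
    return score > 0.5

def clinician_rule(score, judgment):
    # Stand-in for expert judgment: it also weighs information the metric
    # never sees. Because outcomes here depend only on the score, that extra
    # information is pure noise as far as the scorecard is concerned.
    return score + 0.2 * judgment > 0.5

judgment = rng.normal(size=scores.size)

for name, predictions in [("actuarial", actuarial_rule(scores)),
                          ("clinician", clinician_rule(scores, judgment))]:
    mean_accuracy = np.mean(predictions == outcomes)  # the "crude average"
    print(f"{name:9s} mean accuracy: {mean_accuracy:.3f}")
```

Once the scorecard is the average of a quantized indicator computed over the machine-legible data, nothing else the clinician brings to the table can ever show up in the evaluation.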
Statistical averages and safe self-driving cars
I had Recht's post in mind when reading Philip Koopman's new book Embodied AI Safety. Koopman, Professor Emeritus of Electrical and Computer Engineering at Carnegie Mellon University, is a safety researcher who specializes in automotive safety. (I first learned about him from his work on the Toyota unintended acceleration cases about ten years ago.)
I’ve just started his book, but these lines from the preface jumped out at me (emphasis mine):
In this book, I consider what happens once you … come to realize there is a lot more to safety than low enough statistical rates of harm.
…
[W]e have seen numerous incidents and even some loss events take place that illustrate “safer than human” as a statistical average does not provide everything that stakeholders will expect from an acceptably safe system. From blocking firetrucks, to a robotaxi tragically “forgetting” that it had just run over a pedestrian, to rashes of problems at emergency response scenes, real-world incidents have illustrated that a claim of significantly fewer crashes than human drivers does not put the safety question to rest.
More numbers than you can count
I'm also reading The Annotated Turing by Charles Petzold. I had tried to read Alan Turing's original paper introducing the Turing machine but found it difficult to understand; Petzold provides a guided tour through the paper, which is exactly what I was looking for.
I'm currently in Chapter 2, where Petzold discusses the German mathematician Georg Cantor's famous result that the real numbers are not countable: the set of real numbers is strictly larger than the set of natural numbers. (In particular, it's the transcendental numbers like π and e that make the reals uncountable; the algebraic real numbers, like √2, can actually be counted.)
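Petzold spends the chapter building up to this, but the heart of Cantor's diagonal argument is short enough to sketch here (my own paraphrase and notation, not Petzold's):

```latex
% Cantor's diagonal argument, compressed (my paraphrase, not Petzold's notation).
% Suppose the reals in (0,1) could be listed, with decimal expansions
%   r_1 = 0.d_{11}\,d_{12}\,d_{13}\ldots,\quad
%   r_2 = 0.d_{21}\,d_{22}\,d_{23}\ldots,\quad \ldots
% Construct x = 0.x_1 x_2 x_3 \ldots by changing each diagonal digit:
\[
  x_n =
  \begin{cases}
    5 & \text{if } d_{nn} \neq 5,\\
    6 & \text{if } d_{nn} = 5.
  \end{cases}
\]
% Then x differs from every r_n in its nth digit, so x is missing from the
% list, contradicting the assumption that the reals could be enumerated.
% The algebraic numbers, by contrast, can be enumerated (for example, ordered
% by the size of their defining polynomials), so it is the transcendentals
% that make the reals uncountable.
```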
To tie this back to the original thread: rationalization feels to me like the process of focusing on only the algebraic numbers (which include the integers and rational numbers), even though most of the real numbers are transcendental.
Ignoring the messy stuff is tempting because it makes analyzing what’s left much easier. But we can’t forget that our end goal isn’t to simplify analysis, it’s to achieve insight. And that’s exactly why you don’t want to throw away the messy stuff.

