Greenspun's 10th rule and the sad state of software quality

4 days ago 1

You've heard of Moore's famous law no doubt: a prediction made in 1965 that the transistor count per semiconductor chip area would double every year and adjusted a decade later in 1975 projecting a doubling every two years. While the transistor count curve has been a bit jerky lately it is still considered to follow the law meaning Moore's law is in effect for more than 60 years.

How could Mr. Moore have ever made such a deep prediction about an engineering discipline that was still in its infancy? He was a champion of his field and possessed the knowledge, experience and motivation to correlate a seemingly infinite number of variables and by that to uncover the Gestalt of the semiconductor industry, the very essence of the thing.

There are deep observations regarding software as well. Often formulated as aphorisms and jokes, people tend to initially overlook, misinterpret and dismiss them as cringy nerdisms but they bestow special powers to those who heed them. For one, there's Zawinski's law stating the following observation:

Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.

It expresses the observation that programs tend to become overcomplicated and at some point do everything. This especially applies to programs you'd use a lot, like browsers, text editors, CAD tools, chat programs, video and music players.

We can juxtapose Zawinski's law to the "Worse is Better" principle by Richard P. Gabriel which holds that a program does not become more useful to users by adding more features. Partially because the users have other programs for these tasks, i.e. for reading mail.

Then there's Frisch's law:

You cannot have a baby in one month by getting nine women pregnant.

A-ha! Software development is an inherently complex task and that complexity simply requires a certain amount of time to materialize. We cannot shorten it by adding more developers because the number of developers is not the dominant factor.

Brook's Law takes this further.

Adding human resources to a late software project makes it later.

Not only is the number of developers not the dominant factor but from some point on it's most probably not a factor at all and just wastes time.

However the number of developers is a dominant factor when it comes to fixing problems as stated by Linus's law:

Given enough eyeballs, all bugs are shallow.

Conway's Law embodies the invaluable insight that the management style and the software product are tightly connected:

Any piece of software reflects the organizational structure that produced it.

Thus, we need to focus not only on the requirements but also take into account how the team, the department and the company operate and make changes if necessary. For example, if our developers don't hang around for too long our product's code would become patchy.

And, of course, Tom Cargill's ingenious adaption of the 80/20 rule to software:

The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time.

Abandon all hope to deadline a project.

These laws are widely known by now and present in every book on software management in some form. However, there is one law that's definitely not even remotely as known because otherwise our world would've been a vastly different place, an Eden of perfect software that makes us smile. In contrast to Moore's law, it paints a depressingly grim picture and forebodes an ever-worsening software quality crisis driven mainly by ignorance and false beliefs. The crisis reveals itself when we consider software as a whole, the aggregate of all software products in use. Formulated by Mr. Philip Greenspun around 1993 and known as Greenspun's 10th rule, it goes to say:

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.

What is this childish nonsense and why am I even wasting my time? Fortran, what year is this? It sounds like an inside joke among computer weirdos. Surely ANYTHING mentioning Lisp cannot under any circumstances be taken seriously? After all, it's the programming language from that movie, "50 and still in my parents' basement". The funny language with the parentheses that looks like this:

(let ((obj (sqrt 16))) (print obj))

which prints 4 by the way but when we add a single quote

(let ((obj '(sqrt 16))) (print obj))

would suddenly print (SQRT 16).

What's the difference between

a) (let ((obj (sqrt 16))) ...)

b) (let ((obj '(sqrt 16))) ...)

a) means assign the square root of 16 to obj but b) means assign a two element linked list consisting of the function SQRT and the number 16 to obj. For Lisp '(sqrt 16) is a two element list but if we remove the quote Lisp interprets it as a function call where the function to be called is the first element of the list and the remaining elements are the arguments. So the difference between lists and code in Lisp is just a single quote and this is what we call homoiconicity and why we see so many parentheses.

There is no special syntax for if, case, for, while or function definitions in Lisp, there are only lists, strings and numbers¹. Homoiconic languages require minimal parsers and allow for powerful code transformation mechanisms known as macros.

Common Lisp is a dialect of Lisp that is particularly well designed and features a standard library that was unusually rich for the time of its conception.

With this basic understanding of Lisp, we can dissect Mr. Greenspun's words on three planes.

Developers choosing a language other than CL will spend considerable time writing utility boilerplate that CL already offers out of the box in a superior quality (tested, profiled, resource efficient). Furthermore, there is an abundance of 3rd party libraries for CL compared to other languages.
Software products need to be configurable through domain specific languages (DSLs) and extensible through scripting languages to be usable so implementing parsers, interpreters and supporting code is inevitable. Lisp is already ideal as a foundation for DSLs and quite capable as a scripting language. By deciding against Lisp, we'll implement something ourselves but it'll hardly match the quality of a Lisp solution. Because everybody else will also decide against Lisp, the cumulative quality of software would be low because the aggregate essentially contains n copies of the same thing of inferior quality compared to Lisp.
Lisp is so expressive that the choice of another language will make you struggle hard to make the code feel less technical and more semantical in comparison. Unexpressive code would suffer from many issues: it's hard to maintain and alter, it might contain hard to find bugs, its purpose would be elusive to new developers and so on. We could say inexpressive code negatively impacts software quality.

The first interpretation is nowadays obsolete. All major programming languages offer well-matured rich standard libraries and good ecosystems of 3rd party code. Taking C++, Rust or Python is quite a safe bet in that regard. There is an abundance of large projects in all of them so we need not debate their practicality. In fact, the situation is now reversed because most major languages feature standard libraries and ecosystems of 3rd party code much richer than CL. We'll see an example of this in a moment.

A closer look at the second interpretation however indeed reveals a software quality crisis not only persistent but worsening. To better understand the argument, let's look at a common web browser like Firefox, Chrome and Edge. They all include parsers for at least HTML, CSS and JS. Usually there'd be more parsers for browser internal purposes such as user preferences and session management. These parsers make a significant portion of the browser's code and, in cases where the parsed data comes from outside, offer many potential vectors for attacks. Thus, the browser developers need to take special care that the quality of the code remains high, i.e. through extensive testing. This task is basically impossible in the case of HTML because HTML code can contain CSS and JS as well so all three parsers have to be orchestrated.

Let's now entertain a thought experiment in which we alter the syntax – but not the overall structure – of HTML, CSS and JS to Lisp.

Before	After
<!DOCTYPE html> <html lang="en"> <head> <title>Page Title</title> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1"> <style> body { font-family: Arial, sans-serif; } </style> </head> <body onload="sayHello()"> <h1>My Website</h1> <p>A website created by me.</p> <script> function sayHello() { alert("Hello!"); } </script> </body> </html>	(:html (:lang :en) (:head (:title "Page Title") (:meta (:charset :utf8)) (:meta (:name "viewport" :content "width=device-width, initial-scale=1")) (:stlye (defrule :body (:font-family ("Arial" :sans-serif))))) (:body (:onload (say-hello)) (:h1 "My Website") (:p "A website created by me.") (:script (defun say-hello () (alert "Hello!")))))
h1 { font-size: 14px; font-family: Arial, sans-serif; color: #ff0000; text-decoration:underline; } .label { white-space: nowrap; width: 30%; } #sidebar .entry { text-align: middle; }	(defrule :h1 :font-size (14 :px) :font-family ("Arial", :sans-serif) :color :text-decoration :underline) (defrule (:class "label") :white-space :nowrap :width (30 :percent)) (defrule ((:id "sidebar") (:class "entry")) :text-align :middle)
function goToUrl(url) { console.log("redirecting"); window.stop(); window.location.replace(url); } $("#redirect").on("click", function() { goToUrl("https://google.com"); });	(defun go-to-url(url) (log console "redirecting") (win-stop window) (win-replace (get-location window) url)) (define-event-handler :click (:id "redirect") (ev) (go-to-url "https://google.com")))

The left and right sides are quite alike and these changes have the immediate benefit that we can now use any Lisp parser to read the data and any Lisp system to execute the logic in the case of JavaScript. We've avoided the scenario of three languages intermixing, instead there's just Lisp now. The code quality, performance and security of the browser would increase dramatically. Unfortunately this is not how history unfolded. Bad decisions were made, the Web grew quickly and we suffer the consequences.

This comparison helps us understand Mr. Greenspun's scepticism regarding non-Lisp parsers expressed jovially as "ad hoc, informally-specified, bug-ridden, slow implementations of half of Common Lisp". Nothing is gained yet additional problems are created. Now consider there's not only HTML, JS, and CSS but JSON, SQL, GraphQL, Postscript, PDF, Dockerfile, YAML, TOML, INI and many many more lesser known.

Parsers for DSLs and scripting languages would often be written in multiple programming languages for "convenience" thus worsening the software quality and increasing the collective technical debt. Eventually, someone understood the necessity for a common denominator between DSLs and introduced XML five years after Greenspun's 10th rule but XML brought more headaches than it cured. Later, Douglas Crockford envisioned JSON to avoid writing data parsers in a JavaScript environment but the fundamental issue remained unaddressed.

In the present, the disaster keeps unfolding and even more wood is thrown into the fire. There is now a language called JSX which extends JavaScript with XML-like syntactic sugar for passing around React objects. It looks like this (the code is part of the react-router project)

import { useLoaderData, Link, Outlet } from "react-router-dom"; import { getNotes } from "../notes"; export async function loader() { return getNotes(); } export default function Root() { const notes = useLoaderData(); return ( <div style={{ display: "flex" }}> <div style={{ padding: "0 2rem", borderRight: "solid 1px #ccc" }}> <h1>Notes!</h1> <p> <Link to="new">Create Note</Link> </p> <ul> {notes.map((note) => ( <li> <Link to={`/note/${note.id}`}>{note.title}</Link> </li> ))} </ul> </div> <div style={{ flex: 1, padding: "0 2rem" }}> <Outlet /> </div> </div> ); }

Imagine having to write a parser for that and making even basic quality guarantees. If parsing XML alone wasn't already hard enough; now there's JS wrapping it with the ability to jump between contexts. How could anyone mistake this for something modern or practical.

Does the solution really have to be Lisp, though? Any language with lambdas will be good enough for the task. We could write the HTML example from above in pseudocode as follows.

["html" ["lang" "en"] ["head" ["title" "Page Title"] ["meta" ["charset" "utf8"]] ["meta" ["name" "viewport" "content" "width=device-width, initial-scale=1"]] ["style" lambda () { page_styles.add_rule("body", ["font-family" ["Arial", "sans-serif"]]) }]] ["body" ["onload" lambda () { say_hello() }] ["h1" "My Website"] ["p" "A website created by me."] ["script" lambda () { GLOBALS.say_hello = function () { alert("Hello!") } }]]]

But we'd quickly find ourselves confronted by many problems and would have to write even more awkward code to solve them, whereas Lisp was designed for this task from the very beginning.

Finally, the third way to think about Greenspun's 10th rule conveys the message that Lisp code will tend to be much more expressive because of the language's structure. Programming language designers try really hard to provide mechanisms for expressive code like annotations/decorators, lambdas, named arguments and pattern matching constructs but they cannot implement macros easily if the language isn't homoiconic although other Lisp features like symbols and keywords would be feasible. Expressivity is directly related to software quality because inexpressive code is hard to maintain long-term and problems are hard to spot.

Common Lisp's CFFI library for interfacing with C code is an excellent example of the language's expressivity. Here's some CL code to call SDL2 functions from Lisp I recently wrote and let's compare it to the Python ctypes code I would have written.

(define-foreign-library libsdl2 (:unix (:or "libSDL2-2.0.so.0" "libSDL2.so.0.2" "libSDL2"))) (use-foreign-library libsdl2) (defconstant SDL_INIT_VIDEO #x00000020) (defcstruct sdl-rect (x :int) (y :int) (w :int) (h :int)) (defcstruct sdl-surface (flags :uint32) (format :pointer) (w :int) (h :int) (pitch :int) (pixels :pointer) (userdata :pointer) (locked :int) (list-bitmap :pointer) (clip-rect (:struct sdl-rect)) (map :pointer) (refcount :int)) (defcfun "SDL_Init" :int (flags :long)) (defcfun "SDL_GetError" :string) (defcfun "SDL_LockSurface" :void (surf :pointer)) (defcfun "SDL_UnlockSurface" :void (surf :pointer)) (when (not (zerop (sdl-init SDL_INIT_VIDEO)) (error (sdl-geterror))))

import sys from ctypes import ( c_char_p, c_int, c_uint32, c_void_p, CDLL, POINTER, Structure, ) SDL_INIT_VIDEO = 0x00000020 class SDL_Rect(Structure): _fields_ = [ ("x", c_int), ("y", c_int), ("w", c_int), ("h", c_int), ] class SDL_Surface(Structure): _fields_ = [ ("flags", c_uint32), ("format", c_void_p), ("w", c_int), ("h", c_int), ("pitch", c_int), ("pixels", c_void_p), ("userdata", c_void_p), ("locked", c_int), ("last_blitmap", c_void_p), ("clip_rect", SDL_Rect), ("map", c_void_p), ("refcount", c_int), ] try: libsdl2 = CDLL("libSDL2-2.0.so.0") except OSError: libsdl2 = CDLL("libSDL2.so.0.2") libsdl2.SDL_GetError.restype = c_char_p libsdl2.SDL_LockSurface.restype = None libsdl2.SDL_LockSurface.argtypes = [POINTER(SDL_Surface)] libsdl2.SDL_UnlockSurface.restype = None libsdl2.SDL_UnlockSurface.argtypes = [POINTER(SDL_Surface)] if 0 != libsdl2.SDL_Init(SDL_INIT_VIDEO): print(libsdl2.SDL_GetError().decode(sys.stderr.encoding), file=sys.stderr) sys.exit(1)

I realize after writing it that the Python equivalent is much better than expected. It's overly string-y, of course, because Python lacks symbols and C structs have to be defined through classes with the awkward placeholder _fields because Python doesn't have macros but it's passable.

Why is Lisp not among the most popular programming languages if it’s that beneficial? That's actually one of the most intensely debated questions on the Internet. Nobody knows for sure. However, we don't need to axiomatize Lisp's superiority and could ask whether Lisp is really that much better instead. After all, Lisp could have numerous downsides along with its advantages. For one, it's not statically typed so you (or your users) will encounter 'three' is not a number errors if you're not extra careful.

My opinion is that Lisp's superiority is real but humanity, in contrast to other engineering fields, has completely given up on software quality and is willing to tolerate everything. We would be frustrated, yes, beyond all measure even, but keep using atrocious software without complaining and taking action. Maybe it has been so long since we used good software that we became numb. Also, bad quality is implied or even encouraged in the agile age of software development, remember Mark Zuckerburg's famous motto

Move fast and break things

Incidentally, JSX I just spoke of came from Facebook/Meta but I didn't write this to bash them because they're not alone in this mindset. Why should any company care for quality if we as developers and users don't?

Due to decades of conditioning Lisp doesn't provoke positive emotions in stakeholders and decision makers. A client will "We'll call you back." me in a nanosecond should I even mention Lisp because they subconsciously associate Python, React, and node.js with quality and prosperity. The whole picture of reliable software is simply not relevant.

People also speak of the Curse of Lisp, an apocryphal tale according to which Lisp programmers fail to unite around libraries and constantly reinvent the wheel individually because Lisp clouds their minds and fools them to see complex problems as simple ones. I don't share this sentiment, there is a good number of libraries for all kinds of purposes for Common Lisp. However, we could speak of the Curse of Programming altogether, because developers seem to be unable to consolidate their effort and would gladly duplicate code across languages and within the same language. There are positive examples, of course, such as the FreeType library considered the thing (tm) for text rendering but there are countless HTML parsers none of which is considered major.

So what should be done if the situation is so dire? Should we move to CL immediately? No, of course not, that's beyond unrealistic. You should learn Lisp and try to use it for your projects however because learning Lisp empowers you as a developer. More important is to think in terms of DSLs and always try to write expressive, declarative code. If the code feels too technical compared to hypothetical Lisp code it's not good unless being technical is the whole idea or you have serious performance issues. Use as much existing software as possible, avoid writing parsers and – when you absolutely must write a parser – use a parser generator. Reduce the collective technical debt instead of increasing it.

Consider the output format of git log:

commit 7f6c57f3612a3d1e2650e179ed9c9a79fe0fc9f9 Author: Mihail Ivanchev <[email protected]> Date: Tue Jan 28 11:02:30 2025 +0100 Reworked IO loop, simplification & removing restarts commit 137bd28dc2f37ee6cbb3fea0fd4b7e506c02bce0 Author: Mihail Ivanchev <[email protected]> Date: Sat Oct 19 17:23:28 2024 +0200 Supporting reproducible builds

Here is a Python program to parse the git log output that I've put together by copying and shortening the code for the same task in the jc tool.

import json import sys from typing import Dict, List def _parse_name_email(line): values = line.rsplit(maxsplit=1) name = None email = None if len(values) == 2: name = values[0] if values[1].startswith("<") and values[1].endswith(">"): email = values[1][1:-1] else: if values[0].lstrip().startswith("<") and values[0].endswith(">"): email = values[0].lstrip()[1:-1] else: name = values[0] if not name: name = None if not email: email = None # covers '<>' case turning into null, not '' return name, email def parse(data: str) -> List[Dict]: raw_output: List = [] output_line: Dict = {} message_lines: List[str] = [] for line in data.splitlines(): line_list = line.split(maxsplit=1) if line.startswith("commit "): if output_line: if message_lines: output_line["message"] = "\n".join(message_lines) raw_output.append(output_line) output_line = {} message_lines = [] output_line["commit"] = line_list[1] continue if line.startswith('Merge: '): output_line['merge'] = line_list[1] continue if line.startswith("Author: "): output_line["author"], output_line["author_email"] = ( _parse_name_email(line_list[1]) ) continue if line.startswith("Date: "): output_line["date"] = line_list[1] continue if line.startswith(" "): message_lines.append(line.strip()) continue if output_line: if message_lines: output_line["message"] = "\n".join(message_lines) raw_output.append(output_line) return raw_output if __name__ == "__main__": print(json.dumps(parse(sys.stdin.read()), indent=2))

I think you'll agree it's not expressive at all; in fact it's full of technicalities and the syntax is masking out what we're trying to accomplish instead of underlining it. Usually I'd never greenlight such code and I urge you also not to. It does exhibit excellent performance, though, which is also the most likely reason for its shape. Now consider this equivalent code that uses pyparsing, a popular parser generator for Python.

import json import sys from pyparsing import ( CharsNotIn, Combine, Group, LineEnd, Opt, Word, rest_of_line, Suppress, ) Padding = Word(" ") Eol = LineEnd() BlankLine = Eol CommitLine = "commit " + rest_of_line("commit") + Eol MergeLine = "Merge:" + Padding + rest_of_line("merge") + Eol Name = Combine((CharsNotIn(" <") + (" " + CharsNotIn(" <"))[...]))("author") Email = "<" + CharsNotIn(">")("author_email") + ">" AuthorLine = "Author:" + Padding + (Name + " " + Email | Email | Name) + Eol DateLine = "Date:" + Padding + rest_of_line("date") + Eol MessageLine = Suppress(" ") + rest_of_line + Suppress(Eol) MessageLines = BlankLine + Combine(MessageLine[...], join_string="\n")( "message" ) Entry = Group( CommitLine + Opt(MergeLine) + AuthorLine + DateLine + MessageLines + Opt(Eol) ) GitLog = Entry[...] if __name__ == "__main__": print( json.dumps( [ entry.as_dict() for entry in GitLog.leave_whitespace().parse_string( sys.stdin.read(), parse_all=True ) ], indent=2, ) )

Although noisy, the code is declarative and directly corresponds to the structure of git log's output. There are no ifs, fors and continues, only the definition of a grammar and its usage. The performance is not great, a common complaint with pyparsing, but maybe it could be improved. Just for completeness sake, here's a Common Lisp version using the parcom library:

(require 'asdf) (require 'alexandria) (require 'cl-ppcre) (require 'parcom) (require 'str) (require 'yason) (defpackage example (:use #:cl) (:local-nicknames (#:a #:alexandria) (#:p #:parcom) (#:re #:ppcre) (#:s #:str) (#:y #:yason))) (in-package :example) ;;; Utility functions ;;; ;;; (defun walk (fn obj) (if (atom obj) (funcall fn obj) (and (walk fn (car obj)) (walk fn (cdr obj))))) (defun flatten (obj) (let* ((tail (cons nil nil)) (head tail)) (walk (lambda (elem) (setf (cdr tail) (cons elem nil)) (setf tail (cdr tail))) obj) (cdr head))) (defun hash-kv (keys vals &key (skip-nil-values t)) (let ((res (make-hash-table))) (loop for key in keys for val in vals do (when (or (not skip-nil-values) val) (setf (gethash key res) val))) res)) (defun empty-string-to-nil (str) (if (s:emptyp str) nil str)) ;;; Parser ;;; ;;; (defmacro defparser (name parser) `(defun ,name (offset) (funcall ,parser offset))) (defmacro parse-bind (var parser &body body) `(lambda (offset) (p:fmap (lambda (,var) ,@body) (funcall ,parser offset)))) (defparser blank-line (p:*> #'p:space #'p:newline (p:pure nil))) (defparser remaining (p:<* (p:recognize (p:take-until #'p:newline)) (p:opt #'p:newline))) (defmacro line-starting-with (prefix &key (ignore-ws t)) (if ignore-ws `(p:*> (p:string ,prefix) #'p:space #'remaining) `(p:*> (p:string ,prefix) #'remaining))) (defparser entry (parse-bind fields (p:all (line-starting-with "commit " :ignore-ws nil) (p:opt (line-starting-with "Merge:")) (parse-bind info (line-starting-with "Author:") (let ((author (re:regex-replace "\\s+<[^<]+>$" info "")) (email (re:scan-to-strings "(?<=<).+(?=>$)" info))) (list (empty-string-to-nil author) email))) (line-starting-with "Date:") (parse-bind lines (p:*> #'blank-line (p:many (line-starting-with " " :ignore-ws nil))) (s:join #\Newline lines))) (hash-kv '(:commit :merge :author :email :date :message) (flatten fields)))) (defparser git-log (parse-bind entries (p:many (p:<* #'entry (p:opt #'p:newline))) (apply #'vector entries))) (let* ((input (a:read-stream-content-into-string *standard-input*)) (y:*symbol-key-encoder* #'y:encode-symbol-as-lowercase) (y:*symbol-encoder* #'y:encode-symbol-as-lowercase)) (y:with-output (*standard-output* :indent t) (y:encode (p:parse #'git-log input))))

Surprisingly for all the preceding glorification of Lisp, it also reads somewhat technical. We first need to write a custom flatten function because CL doesn't provide one. We write one using another function, walk, providing depth-first traversal of a list. Afterwards, a mini-language on top of parcom is introduced with macros and finally the data is parsed.

Wrapping this longer rant up, Mr. Greenspun offered very early a simple explanation of why our software would turn out bad and become a liability and a burden rather than a useful instrument aiding us in our lives. Writing good software is hard, no doubt, countless things are to be considered and our choices carry a responsibility when software as a whole is considered. We cannot just go ahead and code but need to be very careful what legacy we leave behind. Greenspun's 10th rule teaches us that the last 30 years of misery could've been avoided by not labeling a good solution as "nerdy" and "cringy" and simply ignoring it. Instead, we should consider our options without bias, look in all directions and try to communicate better with fellow developers instead of working in isolation. Our sights should be focused as much on the management of the collective technical debt and sustaining high code quality as on our current projects. In short, in our personal quest to become better developers we should etch the words of Greenspun's 10th rule onto our minds and offer the world the software renaissance it has been desperately longing for.

Read Entire Article

Greenspun's 10th rule and the sad state of software quality

Related

Snappaster – take picture with webcam and put path in clipbo...

What the failure of a superstar student reveals about econom...

Record/Replay in a VM on macOS: A Tutorial