Fuss-Free Universe Hierarchies


This is my blog, in which I write about a variety of topics including computer science, mathematics, and the design of tools for scientists.

2025-10-06 · Fuss-free universe hierarchies · https://www.jonmsterling.com/01HX/

The thing about universe hierarchies is that they are both painful to implement properly, and punting on them leads to even more pain later on because introducing a hierarchy is an extremely disruptive change for both the system and its libraries. After going through this treadmill a few times, I’ve learned my lesson: universes are something we need to get right from day one. Let me clarify that it is a non-goal to have inferred universe levels; I instead want to make working with universe levels as simple as possible, and to minimise the number of places where they need to appear.

The first thing people ask is whether we will use universes “à la Russell” or “à la Tarski”. This is a question laden with false ideology, because the only kind of universes that exist in an objective (i.e. generalised algebraic) sense are the latter. But there is a valid question lying under the surface, which is whether or not users will be expected to explicitly write the decodings and universe level coercions. To that, I answer that we have the elaboration technology today to reliably avoid these bureaucratic details cluttering the source language.

2025-10-06 · So you want a coherent universe hierarchy… · https://www.jonmsterling.com/01HY/

Someone who wishes to deploy modern elaboration technology to achieve an implicit source language, however, must think very carefully about the algebraic formulation of the core language to ensure that it has enough equations to maintain predictable behaviour under the implicit Russell-style notation. For example, if the level coercions are not coherent enough, it is easy to create unacceptable error messages that appear to the user like “Could not unify A -> B with A -> B.” Unfortunately, strong enough core languages are very laborious to specify and implement; for example, you will need the following (or some variation):

  1. You need top-level type constructors for all type theoretic connectives.
  2. For each universe level i, a type U[i] : Type and a dependent type el[i] : U[i] -> Type and codes closing U[i] under all type theoretic connectives, including codes for U[j] with j<i. The el[i] operation needs to commute strictly with the type constructor codes.
  3. For universe levels i <= j, you need a lifting operation lift[i<=j]: U[i] -> U[j] that commutes with all type constructor codes and is, moreover, definitionally functorial in the level preorder.
  4. You need el to commute with lift. (Thanks to for catching this!)
  5. You need both lift[i<=j] and el[i] to be definitionally injective.
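To fix ideas, here is a sketch (in informal notation of my own, with pi[i] standing for the code of dependent products in U[i]; the other connectives are analogous) of the kinds of equations the above items demand:

\[
\begin{aligned}
\mathsf{el}[i]\,(\pi[i]\,a\,b) &\equiv (x : \mathsf{el}[i]\,a) \to \mathsf{el}[i]\,(b\,x)\\
\mathsf{lift}[i{\le}j]\,(\pi[i]\,a\,b) &\equiv \pi[j]\,(\mathsf{lift}[i{\le}j]\,a)\,(\lambda x.\ \mathsf{lift}[i{\le}j]\,(b\,x))\\
\mathsf{lift}[j{\le}k]\,(\mathsf{lift}[i{\le}j]\,a) &\equiv \mathsf{lift}[i{\le}k]\,a, \qquad \mathsf{lift}[i{\le}i]\,a \equiv a\\
\mathsf{el}[j]\,(\mathsf{lift}[i{\le}j]\,a) &\equiv \mathsf{el}[i]\,a
\end{aligned}
\]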

There could be ways to refactor the above; for example, you could get rid of the top level of types altogether and just have universes. This is sometimes referred to as “universes à la ”, and is exemplified in works of like . Although admirable for its parsimony, I no longer like this approach because I agree with that types and their elements are different concepts. And, for engineering purposes, although it seems at first easiest to have only universes, I think there are subtle ways in which that choice can make other aspects of the implementation more complicated. I think the importance of a separate top-level notion of type is argued for very well in the .

Anyway, this is a lot to implement in the kernel of a proof assistant.

2025-10-06 · A new idea: Working your way down from the top · https://www.jonmsterling.com/01I0/

About a year ago, I started to get the inkling of an idea that we should not program with respect to a universe at all if we don’t need it. Instead, we should do large things by default (just like in everyday mathematics) and then when we have a need to talk about small things, we should just specialise them as-needed. For example, let’s consider the generalised algebraic theory of a category as phrased in univalent foundations:

  1. We have a -type of objects C.
  2. For every two x,y:C we have a -type of arrows C(x,y).
  3. And a bunch of operations and axioms involving the above.

There is nothing about this definition that requires any particular constraints on the size of C and of C(x,y). In fact, the proper definition would not require that C(x,y) have the same size as C(u,v)! From the point of view of (generalised) algebra, these are just sorts. In fact, you never need to constrain the size of these things until you wish to start talking about things like the (bi)category of categories, etc.

In contrast, if you require all types mentioned in a theory to have a given size, you immediately create a lot of complexity. First of all, you will be automatically assuming that the category is locally small for some universe—by requiring that C(x,y) be V-small for all x,y; there can be times when this is an unnatural assumption. But maybe the bigger problem is that when you want to start developing the theory of functors and natural transformations and so on, you will end up with half a dozen universe levels annotating every definition. Implicit universe polymorphism is not a solution to this problem: implicitly resolving the levels does not remove the complexity, and any experienced mechaniser will know that you inevitably need to debug a universe constraint graph.
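For a sense of the annotation burden, the following is only illustrative of how such a signature tends to look in an Agda-style system (the particular levels are made up for the example, not taken from any library):

\[
\mathsf{Functor} : \{i\,j\,k\,l : \mathsf{Level}\} \to \mathsf{Category}\,i\,j \to \mathsf{Category}\,k\,l \to \mathsf{Type}\,(i \sqcup j \sqcup k \sqcup l)
\]

and the type of natural transformations then inherits all four levels over again.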

2025-10-06 · Some dream code · https://www.jonmsterling.com/01I1/

What I would like is the best of both worlds: to be allowed to develop the theory with fully unconstrained sizes, and then specialise the theory to more specific sizes later. Consider the following dream code for the theory of categories:

theory Category where
  ob: Type
  hom: (x y: ob) -> Type
  // operations and axioms...

There is no implicit polymorphism happening in this definition. It is literally a very large theory of large categories. We can develop the theory of functors and natural transformations at this level too:

theory Functor where
  C: Category
  D: Category
  ob: C.ob -> D.ob
  hom: (x y: C.ob) -> C.hom x y -> D.hom (ob x) (ob y)
  // axioms...

theory NaturalTransformation where
  C: Category
  D: Category
  F: Functor / {C => C, D => D}
  G: Functor / {C => C, D => D}
  go: (x: C.ob) -> D.hom (F.ob x) (G.ob x)
  // naturality axiom...

If we want to talk about small or locally small strict categories, this is going to be very easy, and we can reuse everything from the theory of functors and natural transformations by instantiation.

theory LocallySmallCategory where
  V: Universe
  ob: Type
  hom: (x y: ob) -> V
  inherit Category / {ob => ob, hom => hom}

theory SmallCategory where
  U: Universe
  ob: U
  inherit LocallySmallCategory / {V => U, ob => ob}

The above registers implicit coercions SmallCategory >-> LocallySmallCategory >-> Category. So if we want to talk about functors between X: SmallCategory and Y: LocallySmallCategory, we can do so by specialisation with no fuss:

Functor / {C => X, D => Y}

We will also be able to define the large category of small categories, again with no fuss.

theory SmallStrictCategory where
  // ...
  encode/surjective: is-surjective encode

theory CategoryOfSmallStrictCategories where
  U: Universe
  scat: Category
  scat.ob => SmallStrictCategory / {U => U}
  scat.hom X Y => Functor / {C => X, D => Y}
  // ...

2025-10-06 · The theoretical foundation · https://www.jonmsterling.com/01I2/

I am now going to describe a way to set up a coherent universe hierarchy so as to ensure a parsimonious interpretation. I would like to emphasise that the implicit source notation can already be elaborated to a coherent hierarchy of the kind described above, and that my reformulation here is only aiming to make this easier.

In essence, the main idea is to think about the image of the universe decoding maps el[i]: U[i] -> Type. In the generalised algebraic theory, we have required that these decoding maps be injective; an easily overlooked consequence of the injectivity is that the image of the decoding map is a derivable sort in the generalised/essentially algebraic doctrine.
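Spelled out (a sketch of one way to phrase it), the derivable sort in question is

\[
\mathrm{Im}(\mathsf{el}[i]) \;=\; \{\, A : \mathsf{Type} \mid \exists\, a : U[i].\ \mathsf{el}[i]\,a = A \,\},
\]

and the injectivity of $\mathsf{el}[i]$ ensures that the witness $a$ is unique, so membership in this image behaves like a proposition rather than extra structure.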

From this observation, we might consider a reformulation where, instead of axiomatising the type codes, decoding maps, and lifting coercions at each level together with all the equations they must satisfy, we axiomatise the images of the decoding maps in Type. This corresponds syntactically to defining a smallness judgement on types, which has a particularly extrinsic feeling to it. In fact, we would define this judgement using a judgemental partial ordering of universe levels; one example rule is the following:
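For instance (my own rendering of such a rule), smallness should be cumulative along the level ordering:

\[
\frac{\Gamma \vdash A\ \mathsf{type} \qquad \Gamma \vdash A\ \mathsf{small}[i] \qquad i \le j}{\Gamma \vdash A\ \mathsf{small}[j]}
\]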

(The real generalised algebraic theory would have a sort of smallness-proofs, but this sort is axiomatised as a proposition. The fact that coherence/proof-irrelevance is baked into the theory fully justifies a notation in which the proofs are omitted.)

Now, we recover the codes and the level coercions in one fell swoop by means of the following simple rules:
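A sketch of how such rules might read: the universe at level $i$ classifies exactly the $i$-small types,

\[
\frac{\Gamma \vdash A\ \mathsf{type} \qquad \Gamma \vdash A\ \mathsf{small}[i]}{\Gamma \vdash \mathsf{code}[i]\,A : U[i]}
\qquad
\frac{\Gamma \vdash a : U[i]}{\Gamma \vdash \mathsf{el}[i]\,a\ \mathsf{type} \quad\text{and}\quad \Gamma \vdash \mathsf{el}[i]\,a\ \mathsf{small}[i]}
\]

together with the equations $\mathsf{el}[i]\,(\mathsf{code}[i]\,A) \equiv A$ and $\mathsf{code}[i]\,(\mathsf{el}[i]\,a) \equiv a$. Level coercions then come for free: if $i \le j$ and $a : U[i]$, then $\mathsf{el}[i]\,a$ is also $j$-small by cumulativity, so $\mathsf{code}[j]\,(\mathsf{el}[i]\,a) : U[j]$ plays the role of $\mathsf{lift}[i{\le}j]\,a$.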

Now, supposing that the smallness and level inequality judgements have been filled out sufficiently, we can derive codes for connectives at the lower levels from the corresponding top-level type connectives. For example, consider the dependent function space:
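A sketch of the derivation: assuming the usual top-level formation rule for dependent products together with a smallness rule closing the $i$-small types under them,

\[
\frac{\Gamma \vdash A\ \mathsf{small}[i] \qquad \Gamma, x : A \vdash B\ \mathsf{small}[i]}{\Gamma \vdash (x : A) \to B\ \mathsf{small}[i]}
\]

a code former on $U[i]$ is derivable as $\lambda a\,b.\ \mathsf{code}[i]\,\bigl((x : \mathsf{el}[i]\,a) \to \mathsf{el}[i]\,(b\,x)\bigr)$, of type $(a : U[i]) \to (\mathsf{el}[i]\,a \to U[i]) \to U[i]$.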

Of course, the notation is a little noisy, but these operations are precisely the sorts of things that are easy for a bidirectional elaborator to insert (and then hide from the user). It works equally well for Agda-style hierarchies with joins:
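For example (again only a sketch), the smallness rule for dependent products would then take the join of the two levels:

\[
\frac{\Gamma \vdash A\ \mathsf{small}[i] \qquad \Gamma, x : A \vdash B\ \mathsf{small}[j]}{\Gamma \vdash (x : A) \to B\ \mathsf{small}[i \sqcup j]}
\]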

The beauty of this simple approach is that we cut by two-thirds the number of universe-related operations we must implement in the core language. In that sense, we have an equivalent reformulation of coherent universe hierarchies that is more friendly on the engineering side.

2025-10-06 · What about universe polymorphism? · https://www.jonmsterling.com/01HZ/

There are a few directions that one might go for introducing universe polymorphism. I will outline a couple options, both of which can be built on top of the above.

2025-10-06 · Explicit polymorphism with a sort or type of levels · https://www.jonmsterling.com/01I5/

In Agda, the levels themselves form a type; this simplifies some things and makes other things more complicated. On the other hand, others have proposed a stratified system in which levels can be abstracted but are not organised into an actual type. The main comment I would like to make here is that the formulation of universe hierarchies I have described will work regardless of whether levels form a type. The main syntactical challenge is to implement the smallness and inequality judgements, but this is easily done. Put in those terms, the type of universe levels is the free semilattice with successor on the presheaf of neutral terms of level type; and it is easy to decide equality in free semilattices with successor.
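To see why equality of such levels is decidable, note (a sketch of the standard argument) that every element of the free semilattice with successor on generators $x_1, \ldots, x_k$ has a normal form

\[
\ell \;=\; \max(n_0,\ x_1 + n_1,\ \ldots,\ x_k + n_k),
\]

recording a constant part together with, for each generator that actually occurs, the largest number of successors applied to it; two levels are equal exactly when their normal forms coincide.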

A semantic challenge for Agda-style levels is that the semilattice laws are required to hold strictly, and so an interpretation into a higher-topos model might be a problem; for example, we cannot rely on the natural numbers being strict in every model of HoTT, which means that the semilattice laws might not hold definitionally. But we can stratify things in such a way that the type of levels is not small for any universe; then the universe levels could be interpreted as the “exo-naturals” in the extensional outer layer of two-level type theory.

2025-10-06 · McBride’s Crude but Effective Stratification · https://www.jonmsterling.com/01I4/

Many years ago, Conor McBride proposed what they called Crude but Effective Stratification. Their idea is to define everything at the lowest possible level where it can live, but then subject the entire type theory to functoriality in displacement operations, which allows developments to be shunted upward in the hierarchy as needed. One way to think about the simplest version of Crude but Effective Stratification is that every top-level definition is polymorphic in a secret universe level variable, which can be restricted along edges in the universe level preorder. For more advanced functionality, you instead think of the top-level definitions as being polymorphic in a displacement.

Conor’s idea is very flexible. As has been pointed out, you could also have a universe hierarchy indexed in integers, and then shunt the developments both upward and downward. This one is justified in a syntactical way: a semantic model of the system would not have negative universe levels, but would instead pass through the observation that a given piece of syntax can only use finitely many universe levels.

Either way, the benefit of McBride’s universe displacements is that when you have an arrangement of universes that need to be in some linear relationship to each other, you can just pick the tightest possible instantiation (e.g. U, U+, U++), and then later use a displacement to create extra room between the universes.

Anyway, I think this is a good idea for handling universe polymorphism, but that question is orthogonal to the design of the hierarchy itself. If we want, we can implement Crude but Effective Stratification on top of our foundation. I don’t plan to do this on day one, because I first want to see what the usability limitations are before introducing general displacements.

In Pterodactyl, I will start with the simpler option in order to keep things simple. I am open to revisiting this in the future. By the way, I have already implemented everything I have described in this post in my secret Pterodactyl kernel (minus the elaboration itself).

2025-10-01 · Lifting coercions to theory refinements · https://www.jonmsterling.com/01HN/

Last time, I described an approach for handling multiple inheritance and implicit coercions in Pterodactyl’s theory preorder. Today it is time to talk about bundling and refinement, and the rules for constructing and coercing implementations of refined theories.

2025-10-01 · Refining bundled theories · https://www.jonmsterling.com/01HO/

When you have a theory, like that of semigroups, you define it in a fully unbiased/bundled way:

theory Semigroup where
  car: Type
  join: car -> car -> car
  join/assoc: (x y z : car) -> join (join x y) z = join x (join y z)

Then we might define monoids in terms of semigroups:

theory Monoid where
  inherit Semigroup
  unit: car
  join/unit/left: (x : car) -> join unit x = x
  join/unit/right: (x : car) -> join x unit = x

Suppose we want to describe the assumptions of the Eckmann–Hilton argument, where you have a pair of monoids sharing the same carrier set and satisfying a distributive law. For that, we need a way to refine a theory by a partial instantiation of its fields. Here is how it might look:

theory EckmannHilton where
  X: Type
  M: Monoid / {car => X}
  N: Monoid / {car => X}
  dist: (u v w x: X) -> M.join (N.join u v) (N.join w x) = N.join (M.join u w) (M.join v x)

Things like this are dealt with in the Rocq world by using a “mixin” pattern: for every structure, you predict in advance what unbundlings you are going to need, and then define a bundled version by taking a sum over these. This works very well, though it is a bit cumbersome and inflexible. I would like to build this feature directly into the language, and do it in such a way that you don’t need to predict unbundlings in advance.

2025-10-01 · Introduction rules for refined theories · https://www.jonmsterling.com/01HP/

“Semantically”, a theory expression denotes a telescope in Martin-Löf type theory, and the type corresponding to a given theory expression should be the corresponding dependent record type / Σ-type; a refined theory corresponds to a smaller telescope. But theories are also names onto which various operations can be attached, so we can’t just elaborate them directly to the record type, as we will lose information that we need to keep hold of.

My thinking is that any theory expression T should induce a telescope <<T>> that consists of the remaining fields. So Monoid / {car => Nat} would correspond to the following telescope <<Monoid / {car => Nat}>>:

join: Nat -> Nat -> Nat
join/assoc: (x y z : Nat) -> join (join x y) z = join x (join y z)
unit: Nat
join/unit/left: (x: Nat) -> join unit x = x
join/unit/right: (x: Nat) -> join x unit = x

Then every theory expression comes equipped with a constructor form

<<T>> -> T

which allows the user to fill in whatever residual fields have not yet been refined.

2025-10-01 · Lifting canonical coercions to theory refinements · https://www.jonmsterling.com/01HQ/

The theory preorder is only the tip of the iceberg in terms of how coercions must be dealt with. These coercions need to be extended coherently across the negative fragment of the type theory itself.

For example, if we have a synthesis form X:Monoid / {car => Nat}, we must be able to elaborate X:Semigroup / {car => Nat} seamlessly. There are a few ways that we might imagine this working. The simplest is to apply the constructor for Semigroup / {car => Nat} and then in each case apply the corresponding destructor of Monoid. I am not in love with this option because it leads to very eager η-expansion during coercion, which can cause term size to blow up.

I think there could be another implementation strategy which is more systematic. Let us consider a refinement scenario where we have refined a theory T by a partial instantiation R.

[Diagram: the refined theory T/R, drawn as a pullback square in the same shape as the diagrams below.]

Now, suppose we have two refined theories T0/R0 and T1/R1. There are many possible functions T0/R0 -> T1/R1 but I wish to argue that only some of these functions are canonical. In particular, a canonical function T0/R0 -> T1/R1 is one that arises in the following way from a canonical coercion T0 -> T1:

[Diagram: a pullback square with T_1/R_1 at the top left, T_1 at the top right, 1 at the bottom left, and |R_1| at the bottom right; the vertical maps are fibrations and the bottom map picks out R_1. Above the square, the projection T_0/R_0 -> T_0 composed with the canonical coercion T_0 -> T_1 induces the dashed canonical map T_0/R_0 -> T_1/R_1.]

To make it more intuitive, we can instantiate the scenario above with something concrete:

[Diagram: the same shape instantiated concretely: a pullback square with Semigroup/(ℕ) at the top left, Semigroup at the top right, 1 at the bottom left, and Type at the bottom right, the bottom map picking out ℕ; above it, Monoid/(ℕ,×) -> Monoid and the canonical fibration Monoid -> Semigroup induce the dashed canonical map Monoid/(ℕ,×) -> Semigroup/(ℕ).]

(Thanks to for pointing out some mistakes in the prior version of the diagrams above!)

The definition of canonical map between refined theories gives, immediately, an algorithm for lifting coercions to refinements. The algorithm involves checking that the square commutes up to definitional equality. In the implementation, we would not literally define the induced canonical map, but instead we would give it as a syntactic constructor and associate to it the correct computation rules. This is transparent from the point of view of the type theory, but it avoids unnecessary η-expansions that would cause term sizes to blow up.

2025-10-01 · Lifting canonical coercions to other type constructors · https://www.jonmsterling.com/01HR/

We will lift coercions to types in the standard way. What we will not ever try to do is lift coercions through any type constructors that lack η-laws, such as parameterised inductive types. This is not only because handling coercions for parameterised inductive types is syntactically difficult (but see prior work on this). Lost in the question of whether we can figure out how to deal with the syntactic difficulties of pushing coercions through positive types is the more troubling question of whether this can be made semantically compatible with the interpretation of inductive types in higher categorical models of type theory, where the “obvious” interpretations of inductive types involve some kind of fibrant replacement.

2025-10-01 · Avoiding dependent coercions via “selfification” · https://www.jonmsterling.com/01HS/

Many treatments of coercions do not handle well the problem of when the type of the coercion seems to depend on the value being coerced. Luo and Soloviev attempted to address this problem by introducing a notion of “dependent coercion”; Lean also has a notion of dependent coercion. Luo and Soloviev’s version seems to have some issues, where the most expressive rules that one would want cannot be included without breaking transitivity elimination; Lean’s version is more restrictive, and I wonder if this is the reason why.

I am contending today that many useful cases of dependent coercions don’t actually need to be dependent. My idea is based on an old trick known only to the ML Modules Crew called “selfification”, which my PhD advisor kindly taught me several years ago. The idea of “selfification” is to transform a dependent coercion problem into a non-dependent one.

Suppose that we have a bidirectional elaborator, and we are now at the mode-shift. We have received some typed term M:T, where T is a theory expression, and we are trying to check it at type B. Rather than searching for a canonical coercion T -> B, we instead first construct the maximal refinement “T/M” of T to a definitional singleton type that contains only M; this means refining all the fields of T to project the corresponding ones from M. Then we try to find the canonical coercion from T/M to B.
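In symbols (this is just notation for the idea above): if the fields of T are $f_1, \ldots, f_n$, then the selfified refinement is

\[
T/M \;:=\; T \,/\, \{\, f_1 \Rightarrow M.f_1,\ \ldots,\ f_n \Rightarrow M.f_n \,\},
\]

a definitional singleton whose only inhabitant, up to definitional equality, is M itself.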

(I think we can also generalise the algorithm to work even when the synthesised type is not a theory expression, but I won’t talk about that today. In case I forget, the idea is to treat the mode shift as always being between theory types, which can be done by having a built-in theory for typed terms.)

I want to show an example. Consider the case of trying to define the following dependent function:

cool: (X: Monoid) -> Semigroup / {car => X.car}
cool X => X

The idea here is that (1) if X is a monoid, then it’s a semigroup by forgetting the extra stuff; and (2) forgetting this extra stuff leaves the carrier set fixed. Can we justify returning X as above via elaboration?

Yes. The elaboration state at that point is asking us to convert X : Monoid to Semigroup / {car => X.car}. Obviously we cannot hope to find a coercion Monoid -> Semigroup / {car => X.car}, but what we do instead is attempt to find a coercion from

Monoid / {car => X.car, join => X.join, unit => X.unit, join/assoc => X.join/assoc, join/unit/left => X.join/unit/left, join/unit/right => X.join/unit/right}

to Semigroup / {car => X.car}. And that is something we can evidently do, using the lifting of canonical coercions to theory refinements described above!

In summary, we have very strong dependent-type-friendly coercions, without needing the coercions themselves to be dependent functions.

2025-09-30 · Labelled preorders and implicit coercions · https://www.jonmsterling.com/01HB/

Pterodactyl aims to support coherent algebraic hierarchies with multiple inheritance as well as user-defined edges. As I mentioned previously, there are a few points worth making here in regard to the so-called “diamond problem”.

2025-09-30 · What is the diamond problem? · https://www.jonmsterling.com/01HD/

The first thing to get out of the way is that the phrase “diamond problem” means different things to different people. There are actually two different diamond problems that arise when building systems that support implicit coercion; these two problems are mutually exclusive, in the sense that you can have only one version of the diamond problem at once.

  1. There is the coherence problem, which concerns ensuring that diamonds (and loops) in the coercion graph commute. A system in which all diamonds and loops commute is called coherent. In a coherent system, backtracking is never required and so the resolution of coercions can be extremely efficient. This is the “diamond problem” for Theory B computer scientists.
  2. In an incoherent system, resolution of coercions must traverse all possible paths from the source to the target, and the presence of diamonds and loops in the graph can cause an exponential blowup for the naïve backtracking algorithm. This is the “diamond problem” for Theory A computer scientists.

Indeed, if your system is coherent then no backtracking is required; but you must find a way to maintain the coherence invariant. On the other hand, if you have an intentionally incoherent system, then you do have to deal with the exponential blowup, but you don’t have to check coherence because you don’t care about coherence. This is the sense in which the two problems are mutually exclusive.

2025-09-30 · Approaches to the diamond problem(s) in existing systems · https://www.jonmsterling.com/01HE/

Here I outline some (but not all) existing approaches to dealing with implicit coercions, multiple inheritance, and the various diamond problems.

2025-09-30 · Coherent type classes in Haskell · https://www.jonmsterling.com/01HF/

Haskell employs a syntactic condition to ensure that type class instances do not overlap, leading to a global coherence property. Practice in the Haskell community has shown that it is possible (but painful) to develop coherent hierarchies. Many advanced libraries will activate a language extension to allow overlapping instances, which renders the hierarchy incoherent.

2025-09-30 · Fast but incoherent type classes in Lean · https://www.jonmsterling.com/01HG/

Type class hierarchies arising in mathematics tend to be unavoidably incoherent; therefore, systems like Lean implement a backtracking type class resolver and are confronted with the second diamond problem above. The Lean team have discovered an algorithm that mitigates the exponential blow-up using ideas from logic programming.

The state of affairs for Lean is, then, that type class resolution is pretty fast but the actual resolution of coercions is in theory non-deterministic; in reality, it does the same thing every time (so far as I know), but there is no guarantee about which coercions it is choosing.

2025-09-30 · Coherent locale hierarchies in Isabelle · https://www.jonmsterling.com/01HH/

Locales provide a coherent way to organise mathematical hierarchies. The problems that lead type class hierarchies to become incoherent do not seem to apply to locale hierarchies, I think, because the resolution of the locale’s operations is name-based rather than component-based. (I am not completely certain that this is the full story, but it seems like it is part of the story.)

One way to think about locales and the reason they can remain coherent is by analogy with Haskell’s newtype trick, which Haskell programmers use liberally to maintain coherence, but taken to its logical conclusion. Instead of generating a fresh copy of a type in order to indicate its intended role in the type class hierarchy, Isabelle doesn’t hang identity on the type at all and instead just starts with the name that carries intentional force.

The coherence of the locale hierarchy in Isabelle is checked when activating a locale as a local theory. Here, an algorithm called “roundup” reaches through the entire locale inheritance graph and brings all the needful operations into scope, and if there is a clash (two names pointing to different things), this results in an error. Such a clash always corresponds to a failure of coherence, and conversely, failures of coherence will always lead to such clashes.

2025-09-30 · Coherent coercions in Rocq · https://www.jonmsterling.com/01HI/

Rocq has a coercion mechanism, which checks coherence and warns users when they are installing incoherent coercions. Coercions are most heavily used in combination with the canonical structures feature, which simulates type-class-like features using something like unification hints (to my knowledge).

The algorithm used in Rocq was created and described in a PhD thesis. The idea here is to maintain the reflexive-transitive closure (path category) of the coercion graph at all times; when inserting a new coercion, you check a generating class of diamonds for coherence and then add all the induced paths. Coherence then ensures that the path category is a preorder, with a functorial embedding into the syntactic category of the language under consideration.
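To make the shape of this algorithm concrete, here is a minimal sketch in Swift of such a transitively-closed coercion table, restricted (as in our setting below) to vertices that are theory names. The types Coercion, compose, and definitionallyEqual are placeholders of my own invention, not Rocq’s or Pterodactyl’s actual data structures.

struct TheoryName: Hashable { let name: String }

// Placeholder for an actual kernel term denoting a coercion.
struct Coercion { let description: String }

func identity(_ t: TheoryName) -> Coercion { Coercion(description: "id(\(t.name))") }

func compose(_ g: Coercion, after f: Coercion) -> Coercion {
  Coercion(description: "\(g.description) . \(f.description)")
}

// Stand-in for the kernel's definitional-equality check on coercion terms.
func definitionallyEqual(_ f: Coercion, _ g: Coercion) -> Bool {
  f.description == g.description
}

struct Edge: Hashable { let source: TheoryName; let target: TheoryName }

enum CoercionError: Error { case incoherent(Edge) }

struct CoercionTable {
  // The reflexive-transitive closure, keyed by (source, target).
  var paths: [Edge: Coercion] = [:]

  mutating func addTheory(_ t: TheoryName) {
    paths[Edge(source: t, target: t)] = identity(t)
  }

  // Resolution is a single hashtable lookup by a discrete key.
  func resolve(from s: TheoryName, to t: TheoryName) -> Coercion? {
    paths[Edge(source: s, target: t)]
  }

  // Insert a user-declared coercion a -> b, closing under composition and
  // checking every induced diamond for coherence.
  mutating func insert(_ f: Coercion, from a: TheoryName, to b: TheoryName) throws {
    var induced: [Edge: Coercion] = [:]
    // Every existing path x -> a composes with f and with every path b -> y.
    for (pre, g) in paths where pre.target == a {
      for (post, h) in paths where post.source == b {
        let key = Edge(source: pre.source, target: post.target)
        induced[key] = compose(h, after: compose(f, after: g))
      }
    }
    for (key, candidate) in induced {
      if let existing = paths[key] {
        // A diamond: the new composite must agree with the recorded path.
        guard definitionallyEqual(existing, candidate) else {
          throw CoercionError.incoherent(key)
        }
      } else {
        paths[key] = candidate
      }
    }
  }
}

In this sketch, insert pays the coherence-checking cost up front, while resolve is the cheap single lookup mentioned below.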

2025-09-30 · Evaluating: Isabelle’s roundup algorithm vs. Rocq’s coercion graph · https://www.jonmsterling.com/01HJ/

I am ruling out incoherent hierarchies from the outset, because these lead to unpredictable resolution. The two viable approaches therefore seem to be Isabelle’s roundup algorithm and Rocq’s coercion graph; I believe that the latter will work best for our needs, even if our goal is to enable an Isabelle-style user experience.

The thing with Isabelle’s “roundup” algorithm is that it seems to rely heavily on the idea that the instances of a locale in scope are fully top-level, so that you can walk through them and eagerly install all their operations into the appropriate scope. In this sense, Isabelle’s implementation of locales does not involve implicit coercion at all. In a dependent type theoretic setting, we must account for things like families of structures and ensure that we can project components from these as well. For example, given a constant denoting a family of structures, we will need to be able to project components from it in ways that pass through the coercion to semigroups, etc. I am not sure how to capture something like that via a roundup-like algorithm.

For this reason, I believe that the best foundation for Pterodactyl’s handling of algebraic hierarchies will be the labelled preorders deployed in Rocq.

2025-09-30 · A clean-room implementation of Rocq’s labelled preorders · https://www.jonmsterling.com/01HK/

In order to learn how the algorithm works, I built a clean-room generic implementation in Swift of the labelled preorder structure, following the approach described above. (Oh, did I mention I’m using Swift?)

Our case will be a bit simpler than that of Rocq. In Rocq, coercions can be between arbitrary things, and therefore the vertices must include definitional equivalence classes of terms. That seems very tricky! We will be using a more restricted kind of coercion in Pterodactyl, where the vertices are only theory names. (We might later on also support coercions to sorts like those of types and functions, like Rocq does, which would not be difficult.)

As a result, the only thing that we need to use expensive definitional equality checks for is the coercion paths themselves. It will indeed be pretty costly to install coercions, but resolving them will be very cheap: it is a single lookup in a hashtable by a totally discrete key. My code is available.

I am not certain that we will use literally this code (abstracting and modularising too early can be hazardous), but it’s a good start.

2025-09-27 · A focused vision for Pterodactyl · https://www.jonmsterling.com/01H3/

After a few very enlightening conversations following my seminar talk, I have realised that one of the more important aspects of my plans for Pterodactyl is its focused and restricted vision. I cannot promise that we will succeed, but I do think that by not trying to be all things to all people, we may be able to avoid some likely ways of biting the dust.

2025-09-27 · We are not building a general-purpose programming language · https://www.jonmsterling.com/01H4/

Any implementation of dependent type theory is a programming language of sorts, following the Curry–Howard correspondence, but I have come to believe that to build an actually good programming language (like OCaml, Rust, or Idris) is a very different task from building an actually good proof assistant (like Rocq, Lean, or Isabelle).

If you have enough resources, you can do both at the same time. If you don’t have enough resources, I think you end up with something that is going to be inadequate as either a general-purpose programming language or as a proof assistant.

I do think that this is a point we could revisit in the future, but only after we have achieved our primary goals. Not having general-purpose programming as a primary goal doesn’t mean we can’t write programs in the language; I think that, instead, punting on things like “Being competitive with OCaml and Rust” means that there will be more space for ambitious dependent-type-native people to explore the possibilities of dependently typed programming without having to answer questions for which we don’t yet have good enough answers.

2025-09-27 · We have no plans to self-host, ever · https://www.jonmsterling.com/01H5/

When you build a language you really like, it is natural to want to rewrite its compiler in itself. This is a reasonable impulse, but I believe that it can also create inertia and painful coupling as you have to negotiate changes to the language with changes to the compiler in both directions rather than in only one direction.

Here’s the thing: if we were to write a highly-performance-constrained system (like a proof assistant) in itself, the proof assistant’s compiler and execution model would need to immediately be competitive with state-of-the-art programming languages with vastly more resources and expertise behind them. I do think that we can build something that is competitive with Agda, but I do not think we could build something that is competitive with both Agda and Rust or OCaml at the same time.

I believe that it is a good goal to have a system that can do both. But I think that in order to get there in a resource-constrained environment, it is very important that the goals be achievable separately. If we have to achieve both in order to achieve one, we will realistically achieve none.

2025-09-27 · We do not have plans for general-purpose metaprogramming · https://www.jonmsterling.com/01H6/

One of the most interesting things about Idris 1, Agda, and Lean is their approach to metaprogramming. In each of these systems, you can hook directly into the elaboration monad and write metaprograms directly in the language, and then designate certain of these metaprograms as tactics or macros.

I think that this is an excellent model, but one of the reasons I am planning to shy away from it is that I believe it introduces a great deal of coupling between user-land code and the implementation details of the system (including its representation of syntax, its approach to elaboration and the suspension of elaboration problems, its approach to unification, etc.).

This is similar to the previous point: with enough resources, you can pat your head and rub your belly at the same time, but I believe that avoiding the coupling that metaprogramming introduces for as long as possible will increase our probability of achieving our primary goals.

2025-09-27 · We do have plans for procedural reasoning · https://www.jonmsterling.com/01H7/

Some community members have expressed scepticism about what might be lost in reducing everything to a problem-oriented programming model like that of Epigram. For example, a large part of typical tactic scripts in Rocq amounts to procedurally generalising goals and calling eliminators, and this is indeed well-handled by the McBride–McKinna model; on the other hand, there are many other useful pieces of tactical reasoning, like apply f or eapply f, that aren’t discussed in that paper. If we built a system in which you could not easily backchain through a lemma, it would indeed be a pretty unrewarding system to use.

The particular case of backchaining through lemmas can, however, be addressed within a very small extension of the model. For this, we will need what might be referred to as the “view from the right”. There is an analogy to be had here:

  • McBride and McKinna defined a general notion of “elimination operator”, and devised a pattern-matching notation for it.
  • We seem to need to define a general notion of “introduction operator”, and then devise a co-pattern matching notation for it.

In Epigram, we can think of the ability to return a value (written ) as the beginning of a “view from the right”, which is just lacking any kind of right-handed decomposition. Our goal is to add this decomposition.

Consider the case of showing that a particular sum is a proposition, using a lemma about the closure of propositions under internal sums. Here is our initial proof state as it might appear in our editor:

Next, we use the backchaining introduction operator, which generalises the usual “return” / co-gadget of Epigram to allow for subgoals.

Note that with records, we can easily arrange for the subgoals of to have more human-readable names than , as we’ve done above. The first goal is easy to solve using another lemma; that lemma has no subgoals, so that branch is finished off like so:

In the second branch, we can abstract the variable.

Next, we backchain through yet another lemma, namely that propositions are sets.

Finally, we finish up the problem with a lemma that has no subgoals.

Personally, I find the above to strike a good balance between readability and writeability while maintaining the very important procedural aspect of how mathematicians write proofs; it is an extremely small generalisation of the McBride–McKinna model.

We can workshop the notation and improve it, but I think the underlying idea here is viable. For those who think “But that is exactly what we already had with tactics”, that is actually my entire point: I don’t want to get rid of procedural tactical reasoning, but I do want to find a way to incorporate it into a problem-oriented notation that mostly documents itself.

I won’t say that this solves all problems, but I hope it gives some impression of the methodology by which we will tackle procedural reasoning.

2025 9 27 https://www.jonmsterling.com/01H8/ 01H8 /01H8/ Stepping through proofs is important

It is important to be able to see what the goal was at a given point in a proof that has already been completed. If we return to our :

We could instrument the language server to emit either tooltips or expandable inlays on each or that will display the goal that is being refined. For example, I plan to make it so that if you select the second in the above, the editor would display something like and if you select the third , the editor would display something like

Ideally, selecting the occurrences of the backchained lemmas on the right-hand side would lead to a display that reveals exactly how they have been specialised for the goal. For example, if you select the occurrence of the editor might display something like the following:

2025 8 17 https://www.jonmsterling.com/01DI/ 01DI /01DI/ Thoughts on XSLT in the “web platform”

Mason Freed, a Chromium/Blink developer employed by Google, opened a ticket a couple of weeks ago on the WHATWG/html repository. Typical of the simultaneously high-handed and ignorant approach of the browser vendor cartels that have come to dominate the Web, Freed is proposing that client-side XSLT support be removed from the “web platform” (which is cartel code for “whatever We decide”). Like on the Open Web, this one is backed by concerns about “security”—as if the only way forward were to keep using libxslt without paying its developers.

Mason’s is completely unworkable, as such a polyfill would increase payloads by an order of magnitude—and it wouldn’t even work for most use-cases where the goal is to serve an XML document that is automatically rendered, rather than to serve an HTML document that embeds the result of transforming an XML document at load-time. It’s a non-starter. Why do people who know nothing have authority over people who know something? I guess it’s the way of the world.

Now, I understand that XSLT in the browser is not as commonly used as it was in the past. Moreover, real industrial users of XSLT are not using XSLT 1.0 but rather later more fully-featured versions like 3.0 via tools like Saxon, etc. But one very common use of client-side XSLT is to render RSS/atom feeds (for blogs and podcasts—yes, people still do those!) in a user-friendly way when viewed in the browser; of course uses client-side XSLT as well. Since the writing has been on the wall for a while, I have been , but I would prefer not to be forced into this by the browser cartels.

Many of the people whose comments have not been censored in the seem to be entirely unaware of the fact that RSS even exists. My guess is these viciously ignorant people think that podcasts are things that you subscribe to on Spotify, or watch on YouTube.

Anyway, I was thinking: I wonder whether the health and security of the Open Web would benefit greatly if Mason Freed and his colleagues at the other major browser vendors did not have the ability to change the Web without the consent of its actual constituents—the people who publish websites. Using a tiny fraction of these vendors’ resources, we could fund the amazing team building (a modern implementation of XPath 3.1, XQuery 3.1, and XSLT 3.0 in Rust), or any one of several other teams working to implement modern XML tooling in open source. I’d love to restore the “web platform” to its rightful owners—the PEOPLE. I don’t know if is the solution, but I’m glad to see some competition in this space from outside the cartel.

To Mason: I’m sure you’re a fine guy in real life. There is a sense in which you are only a vessel for these proposals, which would be made by someone else if you weren’t involved. But I think there are plenty of people for whom there would be no salary big enough to get them to let the cartel’s words flow through their lips so uncritically. I think those are the kind of people we need in stewardship over the Web.

See also this .

2025 8 6 https://www.jonmsterling.com/01C9/ 01C9 /01C9/ : a work-in-progress newsreader for the Macintosh

During my Summer holiday, I began working on a new newsreader for the Macintosh which I have tentatively named Newsradar. I am a mostly happy user of the great by , but I have an idea for a few unique features that would greatly improve my own newsreading workflow. I also believe strongly in the capability of syndication (in the form of RSS, Atom, JSON Feed, etc.) as a much simpler and cheaper alternative to federation as a means to reclaim the World Wide Web. Obviously, this is a fully native app that complies with Macintosh user interface conventions.

2025 8 6 https://www.jonmsterling.com/01CE/ 01CE /01CE/ Organising feeds with labels instead of folders

Most newsreaders organise their feeds into folders: that means that one feed appears in at most one folder. I have found that it is sometimes unnatural for me to decide upon one categorisation of my feeds. For example, I subscribe to many people’s weeknotes feeds; ideally all these would be in one folder. On the other hand, I also subscribe to various blogs from people and groups locally in Cambridge and at times I would like to view just the Cambridge Feeds. The problem is that some of these are weeknotes: so I must decide whether to put (e.g.) a student’s weeknotes in the Weeknotes folder or in the Cambridge folder.

To solve this problem, I am switching to a style of organising feed subscriptions based on labels or tags (like Gmail does for individual emails). A label is like a folder except that a given feed can have multiple labels without duplicating the subscription.

2025 8 6 https://www.jonmsterling.com/01CJ/ 01CJ /01CJ/ Deduplicating articles that appear in many feeds

An overlooked reality of syndication is that a given article may appear in several feeds. For example, some of my book reviews get posted both to my and to , and it is conceivable that someone might subscribe to them both. My model allows the same article to have multiple “sources”; of course, it is possible that the content may differ, and I will eventually come up with a way to flag this to the user. One thing to keep in mind, however, is that it is not necessary to account for all possible contingencies in an environment like newsreading where there is no adversary.

2025 8 6 https://www.jonmsterling.com/01CF/ 01CF /01CF/ Smart collections for news triage

Sometimes you want to constrain your view to something more (or less) specific than a single label. For example, you might want to look at the union of two labels, or constrain by publication date, etc. For this, I am introducing Smart Collections, which are essentially named queries; these are similar to Smart Folders in the Macintosh Finder, but my notion of predicate is more sophisticated (unions and intersections can be nested arbitrarily).
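
To make the idea concrete, here is a rough Swift sketch of such a recursive predicate; the names (Predicate, Article, and so on) are placeholders of my own invention rather than the app’s actual data model:

import Foundation

// Placeholder article model for the sketch.
struct Article {
    let title: String
    let labels: Set<String>
    let published: Date
}

// A Smart Collection is essentially a named query; because the predicate is recursive,
// unions and intersections can be nested arbitrarily.
indirect enum Predicate {
    case hasLabel(String)
    case publishedAfter(Date)
    case union([Predicate])        // matches if any sub-predicate matches
    case intersection([Predicate]) // matches if all sub-predicates match

    func matches(_ article: Article) -> Bool {
        switch self {
        case .hasLabel(let label): return article.labels.contains(label)
        case .publishedAfter(let date): return article.published > date
        case .union(let subs): return subs.contains { $0.matches(article) }
        case .intersection(let subs): return subs.allSatisfy { $0.matches(article) }
        }
    }
}

// Example: everything labelled “Weeknotes” or “Cambridge”, published after a given date.
let cutoff = Date(timeIntervalSince1970: 1_735_689_600) // 2025-01-01
let recentQuery: Predicate = .intersection([
    .union([.hasLabel("Weeknotes"), .hasLabel("Cambridge")]),
    .publishedAfter(cutoff)
])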

2025 8 6 https://www.jonmsterling.com/01CH/ 01CH /01CH/ In the future: posting to your own feed!

for an RSS reader-writer app; this would be an app that lets you post to an RSS feed just as easily as you subscribe to others. I actually believe this would entirely replace the functionality of social media that I benefit from, and conveniently it would be poorly adapted to replace the parts of social media that I don’t benefit from. Here’s what Louie says:

As Twitter crumbles and people scatter into federated social media, I remembered we already had a method to self-publish posts, large or small: RSS. I think a big reason it hasn’t caught on for most people is that there’s not a good method to open an app, compose a post, and send it. I think it could exist, and maybe it would look like this.

I think this is the right direction. I think it will take a bit of thought and design work to do this right, but I believe it’s worthwhile. It is worth noting that the blog authoring app was originally part of . It is high time to consider a modern take on the unification of reading and writing, building on what came before as well as newer ideas drawn from modern social media like Dave Winer’s concept.

2025 8 6 https://www.jonmsterling.com/01CG/ 01CG /01CG/ Let’s get back to syndication!

Federated social media like Mastodon presents a number of downsides and even dangers to users and operators:

  1. When you run a social media site, you are subject to the Law—not only that of your own country, but even that of foreign countries. Maybe you are small enough that nobody notices, but at some point, there is going to be a regulator knocking on your door telling you that CSAM or other illegal material has been posted on your website and you should not be surprised to find out that you are responsible for that. At times you will also be obligated to censor your users who post from the point of view of , and it is worth considering whether this may in the future entail a legal obligation to censor those who are speaking out to prevent a new Holocaust, etc.
  2. Moderation at the “instance” level comes into contradiction with the federation model, because (1) different users have different legitimate preferences, (2) the prevalence of culture war engagement among instance owners leads inexorably towards mass defederation, and the “normies” are constantly getting robbed of their social graphs as a result whilst having no idea what controversy they inadvertently stepped into. To be very clear, I support people having spaces with moderation policies that protect them from harm; but I believe that making these decisions at the granularity of instances rather than individual people does not stabilise over time.
  3. Having each person run their own federated Mastodon instance seems to be too expensive (putting aside the need for system administration, etc.). When you start to federate with high-volume instances, you can get totally blown up.

As , we don’t need content moderation at all with ordinary syndication via RSS, Atom, etc. The big reason content moderation is needed on present-day social media platforms is, aside from legal requirements for platform operators, the insistence on unrestricted replies and the “discoverability brainworm” (Louie’s coinage) that has made the Web into a place that serves influencers, hucksters, and narcissists to the detriment of artists and intellectuals. The way to escape from this is to stop trying to be platform operators; the Web is the platform.

I understand that many people want to use the Web to make money or become famous; but I am equally within my rights to wish to use the Web to communicate ideas, create and consume educational materials, and socialise with my friends and family. It is a good thing rather than a bad thing to create tools that explicitly do not support the kind of fame-seeking behaviour that has turned the Web into a hellscape of scams, advertising, shilling, gambling, and exploitation. I do not judge the way any person chooses to make a living, but I know what I do and do not want in my life.

In light of recent efforts by content platforms to blacklist and “de-host” creators of media deemed unsavoury by the large financial institutions, I believe that syndication via RSS/Atom/etc. is a comparatively safe direction for individual sovereignty in our use of the Web, and it luckily costs almost nothing to host a feed (and you can get a or account and just not worry about it if that’s your speed). What is needed, however, is software to make it easier for ordinary people to “jack in” to the World Wide Web in a way that sticks.

2025 8 6 https://www.jonmsterling.com/01CI/ 01CI /01CI/ Stay tuned for more!

The software is in a very early prototype stage right now, and I am considering the design in light of Apple’s recent ; it is really a challenge to develop a usable app whilst hewing to Apple’s haphazard design language, but I believe there is also an opportunity here to go above and beyond and return to the great era of polish in applications for the Macintosh.

I am also considering funding models. One option is to sell the app in the traditional way; I am resistant to subscriptions because I would never personally subscribe to an app like this. I understand that the economics of the one-time purchase don’t work well anymore, but I am not looking to make a living from this; the subscription model rarely provides enough value for users, who end up feeling exploited—indeed, software today in the subscription era costs far more than software did in the 2000s even taking into account yearly upgrade prices. The blown-up cost of good software today is one of the reasons why there is, I believe, so little diversity in the software market compared to the old days, when you could spend $40 on an app and use it forever until you chose to upgrade.

I would also be open to a donation model, but I have to consider the matter carefully.

Feel free to write to me if you are interested in trying out an early version when that is ready (it won’t be soon).

2025 7 31 https://www.jonmsterling.com/01C8/ 01C8 /01C8/ The Greatest Predator Ever Known!

’s is the most delightfully unhinged novel I have ever read. I cannot deny that I have loved all three of the preceding Dune novels, perhaps even more so than God Emperor. But just as Leto II intentionally reduced the mystique of Paul-Muad’Dib to parody in order to deliver Humankind from a horrifying future, I suppose Herbert’s ruthless approach to our memory of wind-swept Arrakis may have been a requirement to make room for the larger-than-life whimsy of God Emperor.

The first four Dune novels by : , , ,

I have always thought that we should write what we know deeply. The problem with Dune and its copious epigrapha—often straight from the blessed lips of Muad’Dib—is the combination of absolute seriousness and near-total lack of signal in his wordy pronouncements. Was Muad’Dib a bullshitter, or was Herbert the bullshitter? I think Herbert aimed to portray Paul-Muad’Dib as a great (but fatally flawed) thinker of another world and time, but the danger of writing speculatively about things you do not know deeply (philosophy, history, the State, and the Spirit) is that successive generations may find the result less credible than did the , whose more limited access to the breadth of human culture tended to generate “profound” insights that are quickly hackneyed.

God Emperor turns all this on its head—Leto II’s wordiness and tendency to unclear (meaningless?) pronouncement are legendary (and a thorn in the side of his retainers and his enemies alike), but that is one of the privileges accorded to a God. Leto II is the greatest bullshitter to ever live, and he knows it! The Golden Path, that tenuous thread in the tapestry of time by which Mankind may narrowly avoid total extermination by prescient machine minds, demands nothing less than Leto’s unintelligible tyranny. Who are we to question a God, and who can be Shai-Hulud besides the only living sandworm?

2025 6 12 https://www.jonmsterling.com/01BX/ 01BX /01BX/ Thoughts on Apple’s new design language

As many predicted, Apple unveiled at its Worldwide Developer Conference a new design language for all its platforms centred around a material that they call Liquid Glass. I have some personal reflections about my time as an iOS platform app developer during the iOS 7 transition, and some thoughts about what the new design language may mean for the remaining independent developers whose businesses have not been destroyed by the App Store’s race to the bottom.

A screenshot of the Music app with the new Liquid Glass design. Source: Apple.

(I will not speak much here about the merits (or lack thereof) of the new design language. There is a lot to say and critique there, but there’s also some reason for hope.)

2025 6 12 https://www.jonmsterling.com/01BZ/ 01BZ /01BZ/ Flat design was about de-skilling app development

If you believe that the purpose of a system is what it does, the purpose of the iOS 7 redesign was to de-skill app development. Admittedly this sounds like a conspiracy theory that ignores Apple designers’ stated motivations, but my experience is that whenever there is a business case for something, that thing will simply happen, and those involved in the transition tend to explain it to themselves in ways that flatter their sensibilities—a macrocosm of the epiphenomenalist hypothesis for the world of business.

The economic context of the transition, returning to the early 2010s, is that Apple’s native platforms were losing ground to (objectively terrible for users) cross-platform alternatives in large part because of the exorbitantly high cost of designing platform-native apps to the standard set in the visually and materially rich design language of iOS 6 and below. Think about that terrible “web view inside an app” thing that your phone provider makes you use in which scrolling is broken and back-buttons are dangerous to press, and which constantly logs you out in the middle of a task, or stalls on a 10-factor authentication labyrinth, or charges your credit card twice due to a lag in responding to a confirmation button press, and you will know exactly what I mean.

2025 6 12 App development in the iOS 6 era

I was a native mobile app developer in both eras, and I’ll tell you that a serious iOS 6 app would involve hundreds of designer-hours producing meticulous custom graphics for most controls—designed to be thematically harmonious with the system appearance, but customised to delight and surprise: think wooden navigation bars with inset text that looks like it was carved with a router. After this artwork was produced (naturally at 1x and 2x resolution, as we were still in the throes of the Retina transition!), the engineers would take hold of it and begin overriding the -drawRect: methods of many views, which was often non-trivial due to the need to change the behaviour of views managed deep within system classes.

A screenshot of Runenstein, a rune catalogue that I designed and built many years ago.
A screenshot of Yardsale, the pre-iOS 7 iPhone app that I worked on with Ed McManus, Ryan Mickle, and Michael Sanders in the early 2010s. Source:
2025 6 12 App development post iOS 7

By way of contrast, designing an app post iOS 7 is considerably less expensive: there are essentially no custom graphics at all, and the only thing the designers are doing is choosing colours and fonts to “highlight the brand”. If there are custom controls, they can be drawn without an expensive designer’s intervention, as in nearly all cases, these are just ugly buttons with slightly non-standard shapes that someone with no skills at all can easily draw in Quartz—or . Certainly there is no engineer sweating over pixels and perfecting the custom animations that support the delightful illusion of material.

A screenshot of FOBO, a live auction app that I built together with the Yardsale team during the iOS 7 transition. Suddenly one’s brand could be reduced to a colour. Source:
2025 6 12 What did “centering the content” achieve?

I have no doubt that behind Jony Ive’s prattling about “centering the content”, which Alan Dye has taken to new extremes more recently, was an actual business case that Apple considered to be of existential importance: if the cost of native application development is not lowered dramatically, native application development will (for all intents and purposes) cease. It is not lost on me that Apple’s de-skilling strategy had the exact opposite of the likely intended effect: there have never been as many non-native apps on Apple platforms as there are today, and I believe there are two reasons for this.

  1. With the advent of Apple Silicon, performance is no longer a strong differentiator for native apps. Many Electron apps (including Visual Studio Code) perform better than native alternatives.
  2. In the era of flat design, in which intricate and materially rich design has been “cancelled”, visual beauty and detail are no longer strong market differentiators for native apps, nor is respect for platform-specific functionality (like proxy icons on the Macintosh!) that is increasingly de-emphasised in Apple’s native toolkits.
2025 6 12 https://www.jonmsterling.com/01BY/ 01BY /01BY/ Liquid Glass is a gift to the indies

I was listening to ’s and one thing that struck me was Marco Arment’s prescient comment that essentially no corporate apps besides Apple’s will adopt it. There are three reasons for this:

  1. Large corporations have gotten used to treating Apple’s decade-long lack of design as a blank slate on which to paint their “brand”. Suppressing the “brand” to unify with the system appearance is a complete non-starter in the corporate world. If you suggest something like that, you will be laughed out of the room.
  2. Most smaller corporate apps were designed and built by consultants rather than in-house, and no small company will be able to justify dropping an additional $200K+ on an app refresh.
  3. Most corporate apps are using some unwieldy cross-platform toolkit like React or Flutter anyway (enough said).

I think the Liquid Glass design presents an opportunity for independent app developers to differentiate themselves from the competition in ways that have not been possible since before iOS 7. The return of texture and depth and active light and subtle animation means that those who treat app development as craft will be able to create vastly different experiences from those created by consultants or even corporate in-house teams whose business motives do not include platform integration or, indeed, delight. (Not all is rosy: the changes to icon dimensions and composition represent a new de-skilling manoeuvre by Apple—but for users, it is hard to say that this is worse than the present dystopia of soulless glyphs.)

These prospects for craft seem not to depend on whether the Liquid Glass design is actually good and accessible for users—it is just complex enough that (good or not) it will lead to the kind of differentiation that we had on both iOS and the Mac OS X platforms in the old days—when an app that was either non-native or poorly crafted (usually both) stood out like a sore thumb in ways that regular users could notice just by touching a control and seeing what happens when you move things around, or finding they can select text that should not be selectable, or scrolling to reveal incorrect insets and content boundaries, etc. Attempts to replicate the Liquid Glass material using web technologies will likely lead to stuttering scrolling and drained batteries, which (again) regular users will be able to notice.

So, whilst I’m shaken by the potential for a further degraded user interface on the Macintosh, I’m more optimistic than I thought I would be about the prospects for independent Apple platform application development in the next ten years. I’m also not certain what this means for —I need to experiment with Liquid Glass to better understand its strengths and weaknesses before returning to that project.

2025 4 25 https://www.jonmsterling.com/01AS/ 01AS /01AS/ We have AI at home…

On Tuesday, I travelled by train to Sheffield to take part in the meeting, where I would be about my that compares partial map classifiers with Sierpiński cones in synthetic (domain/category) theory, which I .

2025 4 25 https://www.jonmsterling.com/01AT/ 01AT /01AT/ A pleasant surprise

I was preparing for my chalk talk when I realised that I could not remember the details of the proof of the main result and they couldn’t really be reconstructed from the abbreviated proof in the paper.

Luckily, I had actually formalised this result in Agda! I did not mention my formalisation in the paper because I do not think of formalisations as scientific contributions except in certain cases (that is a conversation for another day). But I did indeed formalise it because the proof was subtle enough that I needed computerised assistance back when I proved it the first time. The result I obtained was frustratingly weak, and seemed to require some annoying side conditions in order to go through; the formalisation helped me be certain that these side conditions were in fact sufficient.

Anyway, I was messing around with the code and what I realised was that I had missed a trick back then: one of the side conditions was actually unnecessary, and it seems kind of likely that the other one is unnecessary too. I am certain I would not have noticed this if I hadn't had the proof assistant, which made it easy for me to try something out and see if it worked. I should have time to update the paper to claim the strong result prior to the camera-ready deadline next month.

2025 4 25 https://www.jonmsterling.com/01AU/ 01AU /01AU/ Arise, symbolic AI!

There is a lot of discussion lately of the impact that some current machine learning techniques, marketed as “Artificial Intelligence”, can have on formalisation of mathematics in proof assistants. Some of the members of the mathematical community have gone all in on this trend (is it a requirement of scientific fame and esteem that you begin to cause trouble in areas of research that you know nothing about?), but I think that evaluating LLMs on Olympiad questions is really missing the point of what computers can do to assist mathematicians. Olympiads are a good fit for LLMs, because kids who participate in Olympiads are behaving much more like LLMs than human mathematicians—mathematics Olympiads are the ultimate feat of pattern-recognition without understanding, and they are certainly a good fit for the Might Makes Right approach being taken within AI today.

Agda (and Lean and Rocq and Isabelle) are “Artificial Intelligences” in the most progressive sense—they augment the limited context that a human can store in their mind at once, and are nimble tools for working mathematicians to check and verify their ideas, and (most importantly) they do not proceed by creating a fetish of illusion and misdirection that deceives the public. Their capabilities are limited, but well-circumscribed. I think often about how important it is to know in a definite sense what a tool can and cannot do, and I increasingly think that this is actually part of what makes something a tool. Some of my colleagues have compared LLMs to calculators, in order to make the case that we should get ready for them to be used as everyday tools; but LLMs are not simply tools in the sense that a calculator is a tool.

2025 4 17 https://www.jonmsterling.com/01AH/ 01AH /01AH/ Putting Mac OS X Tiger back to work

Over the Christmas holiday, I bought an iMac off eBay for £50. Why was it so cheap? Because it is a 2006 model firing on a single gigabyte of RAM with an Intel Core 2 Duo processor, running Mac OS X Tiger. When I was a kid, I dreamed of having a computer like this—for me, the white plastic era will always be peak Apple design, and Tiger’s version of Aqua was the most polished and refined form of the design language that they ever managed to produce. My first Macintosh was a white polycarbonate unibody MacBook running Leopard—and at the time I greatly regretted having just missed the Tiger era for the gaudy and overly darkened feel of Leopard with its sinister-coloured window controls. I did not know at the time how much worse things would get…

My excuse for purchasing this machine was that I “needed” to run Mac OS X Tiger as “research” for my experimental project , which imagines how the Aqua design language could have evolved if it had been allowed to. But really, I wanted to relive my rare trips to Apple retailers as a kid, where I would spend minutes doing nothing but just moving the scrollbar while watching its stationary “wave” texture, or highlighting buttons to see them burst with blue radiance.

(I spoke about many of the topics covered in this post in my appearance on the hosted by Fredrik Björeman: .)

2025 4 17 https://www.jonmsterling.com/01AI/ 01AI /01AI/ Day One: what can you do, 19 year old iMac?

When the delivery came, I took the machine gingerly out of its battered and taped-over original packaging and turned it on to a feeling of great excitement, which quickly gave way to loss and melancholy: so much of what computers are “for” involves the World Wide Web, and the forced transition to HTTPS/TLS has stolen access to the Web from the users of many working computers (unless they gut the machine by downgrading to a modern but severely less polished operating system, like Linux). The old Intel Macs are a prime example of this loss—although some to enable safe access to the Web for PowerPC machines, older Intel Macintoshes have received comparatively less attention. Capitalist forced obsolescence comes to all, however, and there will no doubt come a time when the “necessary” “security” routines will simply not be executable with the kinds of hardware resources that could be mustered in 2006, no matter the system architecture. After playing around and finding much of the functionality crippled due to being barred from the Internet, I had to ask myself, What should I even do with this thing?

The iMac lay dormant in my room for the next few months while I figured out an answer to that question.

My iMac sleeping peacefully in my office at Memorial Court, connected to a vintage A1048 keyboard and Apple Pro Mouse (as it should be!). Nearby is my iPod Classic, which I use for about an hour each day and charge once every 6-8 weeks.
2025 4 17 https://www.jonmsterling.com/01AJ/ 01AJ /01AJ/ With a little love, everything has a use

Last week I finally realised that there is a lot I can still do with this machine. I turned it on when I had a bit of free time, and found that it remains very snappy—software opens instantly without hanging, and in fact the built-in programs are significantly less bug-ridden than they were in subsequent versions of Mac OS X and its (unworthy) successor macOS. To put this into perspective, the “outdated” iMac’s performance was far better than that of my last Intel iMac from 2020 with sixteen times the RAM and several times the processor cores.

It is well-known that hardware capabilities get better and better each year, but this did not translate into improved performance for users until after the Apple Silicon transition—when the hardware improvement was so great that it was able to outpace the deathmarch of inefficient software, for a time. Don’t worry, the “transition to AI” is going to destroy all those gains soon and we’ll be back where we started.

Mac OS X Tiger is still King—with the peak versions of Finder, Preview, and iTunes.

But I digress. Even if you can’t use the Web, there are many things that a 19 year old iMac running Mac OS X Tiger is better at than a more recently manufactured machine. For example, Tiger was the last version of Mac OS X in which Preview.app (the PDF and image viewer) had a working search interface; from the subsequent version (Leopard) all the way until the present day, searching is somehow both too fuzzy and not fuzzy enough, and there seems to be no combination of quotation marks that will lead to reasonable results appearing in the search pane. (Same with Mail.app, which has somehow got even worse in the past year; you can’t connect to email on such an old machine anyway, so the point is moot.)

Similarly, iTunes 7 was the functional peak for Apple’s music management and playback software (although iTunes 6 was visually superior), and people who have only used Apple’s current “Music” app will not be able to understand what they are missing. Likewise, the version of Finder shipped with Tiger was the most polished and least buggy version they ever produced; it is really amazing to switch back and forth between macOS 15.3 and Mac OS X 10.4, and find that most of the bugs or usability problems I have encountered on a daily basis for the past decade or so are actually regressions.

2025 4 17 https://www.jonmsterling.com/01AK/ 01AK /01AK/ The perfect music and PDF reading workstation

So I transferred my music and PDF libraries to the iMac—this was easy to do by creating a local WiFi network from the iMac, a functionality that has been removed in macOS(!). Indeed, modern macOS has replaced some (but not all) aspects of this functionality with what is called “Internet Sharing”, but this feature does not work reliably and in many cases the needful functionalities are unpredictably grayed out and disabled without any message explaining why. Death by a thousand papercuts... But I digress: I set up a local WiFi network with a file server easily using the old System Preferences application (don’t get me started on the bizarre redesign of System Settings introduced in macOS Ventura), and easily transferred everything I wanted to the iMac and then I was off to the races.

I listen to music and study papers on this machine, and it gives me so much joy to use this masterpiece of practical industrial design every day—I even write referee reports on it using an ancient version of , a venerable piece of software that I have to say has not improved much in the past two decades. After installing a copy of (don’t worry, I own a license for and you should too!), I find myself doing creative writing in my free time like it’s 2006.

2025 4 17 https://www.jonmsterling.com/01AL/ 01AL /01AL/ What about my iPod Classic?

Some of you may be aware that I use an iPod Classic every day. The thing is a godsend—the best mobile device I own. I bought it with a fresh battery and SSD, and the damn battery lasts for months before I have to recharge it. That is the kind of technology that was taken from us and replaced by resource-intensive devices governed by the logic of planned obsolescence. But I have it back—my world is not the same as your world, but it is a world I am glad to have returned to.

Naturally, the first thing I wanted to do was use the iMac as a hub for synchronising the iPod with iTunes. This will work, but what I did not anticipate is that one of my main uses of the iPod is to listen to podcasts, and podcasts cannot be downloaded on the iMac because of the vicious imposition of TLS on all parts of the web that didn’t need it ( really ought to have been called Let’s Kill The Open Web). So I continue synchronising my iPod with my modern day MacBook Air—and it is a testament to Apple’s historical approach to backward compatibility that this is still possible (and even integrated with the otherwise terrible Podcasts app!).

2025 4 17 https://www.jonmsterling.com/01AM/ 01AM /01AM/ Is retrocomputing sustainable?

I constantly feel a pang in the back of my throat when I think about retrocomputing over the long term. We are scrounging around for intact pieces of old technology, but there will come a time when these are too scarce, or when we have really lost the ability to repair them. It is like living in a post-apocalyptic film where a cataclysm has made all manufacturing impossible—but today the cataclysm is not a war or even a virus, but just the simple vicious logic of Capital and a technology industry that has hitched itself to the most ignorant and anti-human trends emanating from the most technologically ignorant people on Wall Street.

Retrocomputing is decidedly not sustainable, in the same sense that living on a stash of canned foods that can no longer be manufactured cannot be sustainable. But also unsustainable is the present-day technological treadmill of consumer goods containing precious metals and dangerous chemicals being produced in the billions and sent almost directly to the landfill.

I think a better question to ask is whether retrocomputing is progressive. I think that retrocomputing can be progressive insofar as it is part of a practice of looking forward—how can we build sovereign technology that respects constrained resources as well as users of different abilities, and cannot be taken away or made useless by Capital and the irrational whims of the stock market? Such a project must have a significant design component, and this cannot be done amateurishly; looking to the iconic design languages of the past for inspiration and education is, then, deeply progressive in this environment.

The tragedy of Mac OS X is not that Apple took it away and replaced it with something inferior: the tragedy is that free software communities have never managed to produce something even remotely approaching its level of fit and finish. Volunteer projects do not deserve my ire, which I reserve for our economic system in which nearly all competent design work is confined to corporate environments, and then wiped away when the wind shifts.

2025 4 17 https://www.jonmsterling.com/01AN/ 01AN /01AN/ Bring back the joy in computing!

Forget your troubles and find something that makes you smile, and reminds you of what is possible when a team brimming with talent comes together to build something beautiful.

Write to me with any joyful and quirky projects, hardware or software, that you would like to share.

2025 4 6 https://www.jonmsterling.com/019W/ 019W /019W/ Ventriloquy of the Mid-Century Man

Moments before leaving for a meeting in Barbados with my colleagues and , I hastily loaded ’s onto my Kindle for airplane reading. I really want to love “hard science fiction” — I spent months with ’s Foundation in 2022 — but I must admit that I am struggling with the limitations of this genre.

The cover of .

In her , questioned the hard/soft divide in science fiction, pointing out that the distinction seems to be based on authors and readers of sci-fi arbitrarily deciding which of the sciences are “Science” and which aren’t.

The “hard”–science fiction writers dismiss everything except, well, physics, astronomy, and maybe chemistry. Biology, sociology, anthropology—that’s not science to them, that’s soft stuff. They’re not that interested in what human beings do, really. But I am. I draw on the social sciences a great deal. I get a lot of ideas from them, particularly from anthropology. When I create another planet, another world, with a society on it, I try to hint at the complexity of the society I’m creating, instead of just referring to an empire or something like that.

There is a good point here, and Le Guin’s attention to social sciences and anthropology and history was indeed a strength, but was also preoccupied with the same throughout Foundation. So what is the difference, aside from the fact that Foundation was written by a man? Well, it cannot be unrelated to being a man, but perhaps we will get to the point sooner by referring to a specific type of man — the Mid-Century Man.

The Mid-Century Man has no time for the usual literary rules governing the development of characters and their arc of growth. He is interested in science and humanity, yes, but through the lens of his own avatar whom he transplants directly from our world of his time into the futuristic world of his story. This hapless transplant finds himself, with no volition, speaking the author’s words through his own lips… He hears his own voice prattle on about “the last refuge of the incompetent” and can do absolutely nothing about it.

The inclination of the Mid-Century Man to spill no ink that might distract from his speculations (flavoured with thematic orientalism, as in the case of ) is not limited to hard science fiction — Herbert falls prey to the same tendency in Dune, which I adore, but he is saved by his mastery of the tragedy-form: indeed, we are already weeping with Paul over Chani’s fate a quarter of the way through Dune Messiah.

In contrast, there is little that is compelling about the characters of , nor those of 2001: A Space Odyssey, and the same holds for ’s Foundation. The mostly one-dimensional characters of these works are foils for the authors’ speculative ideas about technology or “future history”, and a shorter work could have been written that omits the ventriloquy act entirely and just directly makes the point the author so strongly wished to make.

Will be the last “hard science fiction” book I read? Surely not… There is still much to enjoy and marvel over — and there is a true sense in which Clarke’s elevator to Heaven is compelling enough to outweigh even the worst of his flimsy characterisation.

2025 3 23 https://www.jonmsterling.com/019O/ 019O /019O/ and : an imperfect conclusion

Two months ago I Harry Harrison’s delightful , a piece of speculative fiction that considers a world in which the Cretaceous–Paleogene extinction event never occurred, and one species of dinosaur went on to evolve over the subsequent 66 million years into a highly sophisticated society stratified by uneven linguistic abilities. The most intelligent stratum of this species calls itself Yilanè, which is at once an ethnonym and a classification of linguistic ability.

In Harrison’s Eden, reptiles of all kinds dominate Africa (Entoban in Yilanè) whereas mankind (Tanu and Sasku in their languages) and other mammals live in North America (Gendasi in Yilanè), confined at first mainly to its colder northern reaches. The icy winds of climate change blow across Entoban and northern Gendasi alike, which leads mankind to migrate south just as the Yilanè begin to colonise southern Gendasi. The encounter of Tanu and Yilanè is predictably violent—when the Tanu encounter the torpid Yilanè males basking on the birthing beaches of Alpèasak (the new Yilanè colony in our own Florida), they slay them immediately and spark a war of the species that shapes the entire trilogy.

Covers of the Eden Trilogy.

That is the premise of , the first book of the Eden Trilogy, which tells the story of a young Tanu man named Kerrick who is kidnapped and raised by Yilanè in Alpèasak. The next two books, and , concern the continuing struggle of Tanu and their Sasku allies to survive in the South while being pursued by a genocidal Yilanè expedition, and of the minority faction of Yilanè called the Daughters of Life who aim to establish a peaceful city in South America.

Although and share the charm of the , they do not quite live up to its promise. Without revealing anything that might spoil the Trilogy (which I do still recommend), there are several imperfections that give a reader the distinct impression of rushed writing—especially in .

  • Throughout, the toponyms Gendasi and Entoban are frequently mixed up, leading to a great deal of confusion. For example, Vaintè says “But Entoban is large, most of it unknown to us, warm and filled with good meat”, but this could only refer to Gendasi in the context of the novel. I am apparently not the only person to notice this oversight, as can be seen by perusing the .
  • Relatedly, there is no explicit toponym for South America, which plays an important role in the story. Indeed, South America appears to be referred to as part of Gendasi, which makes sense considering the land bridge that joins it to North America, but in that case it is surprising that Gendasi is glossed as North America rather than The Americas.
  • Several interesting story-lines appear to be dropped on the floor. What of the scientist Ambalasei’s fate, and will the radical city Ambalasokei survive?

Perhaps what was needed most was more time. Despite these imperfections, I don’t regret my time in Eden—and I dream of returning there one day.

2025 1 25 https://www.jonmsterling.com/015W/ 015W /015W/ : you can judge a book by its cover

I have found that one could do far worse than perusing the subreddit and picking things to read at random. This week I chose based on its outstanding cover art, and it turned out to be a very enjoyable read.

The cover of , by Harry Harrison and illustrated by Bill Sanderson.

The premise is that the impact causing the mass extinction of non-avian dinosaurs never occurred, and neolithic Man unknowingly shares the Earth with a very advanced and highly stratified female-dominated race of intelligent lizards whose males have two penises (don’t ask me why! apparently this is common in nature). When climate change threatens both the human and lizard civilisations, they are forced to migrate and, in doing so, discover each other’s presence and clash violently.

Some aspects of the story are a little old-fashioned—I barely need to discuss the instinctive disgust and “racial hatred” felt by the humans and the lizards for each other, but the narrative ultimately challenges this feeling by revealing each culture to be civilised in its own terms, if not in terms that would lead to a cooperative multi-species stewardship of the Earth.

The initial inability of the intelligent lizards to fathom that the “filthy ustuzou”—the lizard phrase for humans and other mammalian creatures—may be sentient and even conscious entities of moral force is challenged by their experience, and by a religious extremist trend within their society that recognises the intrinsic value of all forms of conscious life. In a time when science continues to reveal, bit by bit, the overconfident ignorance and moral wickedness of those who have insisted that the “lower forms of life” on our own Earth are without sentience, value, or moral force (the octopus, the elephant, etc.), it is hard to read this without thinking about our own “filthy ustuzou” and the future consequences of our vicious behaviour toward them.

2024 11 17 https://www.jonmsterling.com/013G/ 013G /013G/ We must fund intellectual infrastructure for the long term

How can we effectively and sustainably fund the intellectual infrastructure for generational human advancement? Intellectual infrastructure—which I would define to encompass the tools used to develop, teach, and learn science—is mistreated by essentially all extant funding models that I am aware of.

To be clear, when I speak of “tools” I mean not only software (like , , , , , , or )—but also more broadly intellectual tools like the and whose main aim is not necessarily to contribute new ideas but to consolidate what is known in order to lay the groundwork for the future.

2024 11 17 https://www.jonmsterling.com/013H/ 013H /013H/ Funding intellectual infrastructure using research grants

In a traditional research grant proposal, one describes a scientific vision instrumented by several key deliverables that can be achieved in anywhere from the space of a year to several years.

There are three main problems with research grants as far as the funding of intellectual infrastructure is concerned.

2024 11 17 https://www.jonmsterling.com/013J/ 013J /013J/ Inadequate timescale of research grants Problem

Research grants are usually time-limited (for example, five years tends to be on the longer side for grants), whereas the sustainable intellectual infrastructure that can actually be safely adopted in critical areas must be funded stably on the scale of decades rather than years.

2024 11 17 https://www.jonmsterling.com/013K/ 013K /013K/ Unquantified impact of intellectual infrastructure Problem

Funding is, as a rule, devoted to projects that promise (1) significant and (2) essentially predictable intellectual contributions. Intellectual infrastructure typically does not yield a direct intellectual contribution, but when neglected for too long can impede these intellectual contributions in invisible ways. Funders of research grants are, however, reluctant to fund what they cannot quantify.

2024 11 17 https://www.jonmsterling.com/013L/ 013L /013L/ “implicit” production of intellectual infrastructure Problem

Because intellectual infrastructure is nearly always developed by a combination of volunteer labour and student labour, funders are reluctant to devote resources to what they anticipate will be done anyway without their intervention—or, given that many PhD studentships are sponsored by such bodies, there may be a perception that the existing model is already adequately funding the production of intellectual infrastructure.

2024 11 17 https://www.jonmsterling.com/013I/ 013I /013I/ Funding intellectual infrastructure on the FRO model

An interesting and encouraging development is the emergence of the model, which recognises and aims to solve and . For example, the most excellent describes their mission as follows:

We are excited to share the news of the Lean Focused Research Organization (FRO)! A new nonprofit dedicated to advancing the Formal Mathematics revolution, we aim to tackle the challenges of scalability, usability, and proof automation in the . Our 5-year mission is to empower Lean towards self-sustainability. Please check our for details. ()

I think that this is a great development indeed, and it will be good for Lean and therefore good for our community. But I think that the FRO model cannot address —the inadequate timescale. Even the reference to “self-sustainability” in the mission statement rests on the questionable assumption that there is anything “self-sustaining” whatsoever about the infrastructure that makes scientific and intellectual progress possible.

It is one thing (a very good thing indeed) to solve usability problems and address pain points that prevent institutional and industrial adoption of intellectual tools in the near term. I have every reason to believe that the will be very successful in this area (and they have made great strides already). An FRO naturally does not aim to solve all problems, and we should certainly not judge them on that unrealistic basis; but we should use the experience to better understand where the gaps are in our current models for the funding and production of intellectual infrastructure.

2024 11 17 https://www.jonmsterling.com/013M/ 013M /013M/ Problems with the bimodal theory/practice cycle

The bimodal “practice—then theory—then practice” model presupposed by the dichotomy between and neglects the dialectical interplay between theory and practice, in ways that can have real consequences for human intellectual advancement. Indeed, it seems to me that the model of dividing funding between traditional research grants and FROs is based on the idea that progress occurs in a two step cycle:

  1. First the “real science” reaches the point where it is strong enough that it could be applied in a practical way, but this application would be impossible to achieve within the research grant funding model.
  2. So an FRO is spun up to pick up where the science left off, and develop the application to the point that it can be practically adopted in institutions and industry.
  3. (Repeat!)

I do not think that the cycle above can lead to sustainable intellectual infrastructure, which requires (1) a more responsive interplay between theory and practice, and (2) long-term continuity of lessons and advances that cannot be achieved on the current model of “once or twice per decade post-mortems” in the form of successive research grant and FRO proposals.

2024 11 17 https://www.jonmsterling.com/013N/ 013N /013N/ Unsustainable tool enhancement is a barrier to progress

I am going to say something strange, and I hope that I will not be misunderstood.

By concentrating an immense amount of funding into “developing the application” without regard for the evolving theoretical basis and sustaining the development over a timescale of decades, we can unintentionally and disastrously raise the bar for high-impact advances by increasing product inertia in unsustainable ways. For example, I would estimate that it costs upwards of £1–2M to create an excellent IDE—so when a five year program results in an excellent IDE for a specific language along the understandably narrow mission of an FRO, the funders may have unintentionally raised the financial baseline for any future transformative impact by at least £1M.

We do not need to make any guesses or conjectures as to whether increased inertia is a likely outcome of these new funding models—it has already occurred, incontrovertibly, as a consequence of the existing industrial-adoption-based model of programming-language production, in which a large corporation can put millions of funding into a language of relatively low quality (and thus significantly raise the usability of its tooling) to the point that making a competently designed language meet the minimum standard for adoption would cost far more than is possible to fund on the academic model.

Obviously the solution to this problem is not to slow the march of tool enhancement—quite the opposite. But we must find a way to achieve these enhancements without making future improvements of a more radical nature prohibitively expensive. I think that the key to unlocking the long-term sustainability of intellectual tool enhancement is to develop a unified and responsive theory/practice funding model that can exercise leadership on the scale of decades.

2024 11 17 https://www.jonmsterling.com/013O/ 013O /013O/ A call to action for a sustainable theory/practice spiral

I am calling for the endowment and funding of independent research institutions on a permanent or decades-long basis, whose mission is not bound to a specific tool (e.g. Lean or Coq or ) but rather to a set of scientific principles and society-level goals to be achieved on a generational timescale.

It may seem as though open-ended endowments of this kind are a pipedream, but I would draw your attention to the example of the at established in 2021 by a gift from billionaire Charles Hoskinson; although the Center is mainly focused on Lean, I see no reason why institutions of this nature could not be endowed on a permanent basis with a broader technical approach.

For future endowments, it will be important to place an emphasis on the funding of permanent staff over the funding of research students and postdocs—whose near-term career needs are often in deep and irreconcilable conflict with the mission of an institution that aims to make generational social impact. If you have resources that could be brought to bear on such an ambitious project, or are connected with someone who does, please get in touch with me ([email protected]). It is time for us to build and sustain the scaffolding for the next three centuries of Science.

2024 11 17 https://www.jonmsterling.com/013P/ 013P /013P/ Acknowledgments

My sincere thanks to for his comments and encouragement.

2024 3 10 https://www.jonmsterling.com/00QB/ 00QB /00QB/ Tips for using plain text email on macOS

I am a proponent of using plain text (or Markdown-formatted) emails rather than HTML-formatted emails. Although there are some legitimate uses for rich formatting in email, I find the well is poisoned: .

2024 3 10 A renaissance for plain text email?

Although plain text email has gone the way of the dinosaur in most parts of society, it remains in use among many technical communities (such as the Linux kernel developers, etc.). Recently, has emerged as a free and open source code forge whose workflow is entirely based on plain text email as described on the website. Naturally, this has led to both curiosity and friction as users attempt to adapt to working with technology of the 1990s that Capital has made increasingly difficult to interact with using present-day software. As a maintainer of several projects on , I have seen many of my valued collaborators completely fall down when it comes to following the (especially text wrapping), and this is certainly not for lack of trying on their part. The available technological options are just that difficult to use.

The main viable tools for working with plain text email are terminal-based: for example, and and its derivatives. These are, however, a non-starter for most users (even technical users): for example, they are difficult to configure, and integrating with address books from standard email providers is nearly impossible or requires a Nobel prize; a bigger problem is that for most of these tools, one must store passwords in plain text (or use some very complicated authentication scheme that a typical user will be unable to set up).

The only viable modern GUI client recommended by for macOS is . Unfortunately, ’s visible settings do not include any option for wrapping text, which would seem to undermine its viability for plain text email users. Furthermore, the versions of available for download from its website (both the released version and the prerelease) seem to be incompatible with macOS Sonoma. By doing a bit of spelunking, I have found that both of these problems can be solved (at least for now).

2024 3 10 https://www.jonmsterling.com/00QD/ 00QD /00QD/ Using for plain text email
2024 3 10 Download the prerelease of Step

First, download the latest prerelease binary for from this directory: . I found that this program can take a very long time to open for the first time, but after that it is speedy.

2024 3 10 Enable format=flowed Step

Second, enable format=flowed using the following completely undocumented command:

defaults write com.freron.MailMate MmFormatFlowedEnabled -bool true

Note that this option is so undocumented that it does not even appear in the list of . There is some discussion of this on the .
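
If you want to verify that this hidden preference took effect, or to undo it later, the standard defaults subcommands work as expected; the domain and key are the same as in the write command above:

defaults read com.freron.MailMate MmFormatFlowedEnabled
defaults delete com.freron.MailMate MmFormatFlowedEnabled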

Although the standard is very rarely implemented by clients, email sent using this flag will be formatted correctly by .

2023 9 20 2023 12 17 https://www.jonmsterling.com/009F/ 009F /009F/ Day tensors of fibered categories

I have been thinking about monoidal closed structures induced by , which has been considered by as a potential denotational semantics of destructors à la C++. It occurred to me that this construction is, in fact, almost a degenerate case of Day convolution on an internal monoidal category — and this made me realize that there might be a nice way to understand Day convolution in the language of . In fact, many of these results (in particular the relativization to internal monoidal categories) are probably a special case of Theorem 11.22 of .

2023 9 20 https://www.jonmsterling.com/009A/ 009A /009A/ The “Day tensor” of Definition

Let and be two over a semimonoidal category . We may define the “Day tensor” product of and to be the following fibered category over :

Note that refers to the left adjoint of base change for , .

2023 9 20 https://www.jonmsterling.com/009C/ 009C /009C/ Stepping through the Exegesis

To understand the construction of the , we will go through it step-by-step. Let and be two over a semimonoidal category .

  1. Given that and are displayed over , we may restrict them to lie over the two disjoint regions of by along the two projections:
    [Diagram: two pullback squares over the projections π₁, π₂ : B × B → B; the recoverable labels show the second square exhibiting π₂*F over B × B lying over F along π₂, with the first the analogous square along π₁.]
    The above can be seen as a cartesian lift in the 2-bifibration of fibered categories over .
  2. Next, we take the fiber product of fibered categories, giving a vertical span over : of course, this corresponds to pullback in or cartesian product in .
  3. Finally, we take a along the tensor functor to obtain the :
    [Diagram: the pushforward of this fibre product along the tensor functor B × B → B; see the formula sketched just after this list.]
    Note: the above does not correspond to composition in except in the discrete case.
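
Putting the three steps together: writing F and G for the two fibered categories, π₁, π₂ : B × B → B for the projections, and ⊗_! for the left adjoint of base change along the tensor functor (notation chosen just for this summary), the construction reads, schematically,

\[ F \otimes_{\mathrm{Day}} G \;\simeq\; \otimes_{!}\bigl(\pi_1^{*}F \times_{B\times B} \pi_2^{*}G\bigr). \]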

Under appropriate assumptions, we may also compute a “” by adjointness.

2023 9 20 https://www.jonmsterling.com/009G/ 009G /009G/ The “Day hom” of fibered categories Definition

Let and be two over a semimonoidal category . We may define the “Day hom” of and to be the following over :

2023 9 20 Proof

First of all, we note that the pullback functor has a right adjoint , as is (like any cartesian fibration) a Conduché functor, i.e. an in . We have already assumed that is a cartesian fibration, and thus so is its ; it therefore follows that is . With that out of the way, we may compute the hom by adjoint calisthenics:
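
In the same schematic notation, and granting the right adjoint to pullback along the projection and the fibrewise hom over B × B arranged in the preceding argument, the adjoint calisthenics land on a formula of the shape

\[ \underline{\mathrm{hom}}_{\mathrm{Day}}(G, H) \;\simeq\; \pi_{1*}\,\underline{\mathrm{hom}}_{B\times B}\bigl(\pi_2^{*}G,\ \otimes^{*}H\bigr), \]

so that maps F → hom_Day(G, H) over B correspond to maps F ⊗_Day G → H. This is a sketch of the general pattern rather than a careful statement.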

2023 9 20 https://www.jonmsterling.com/009B/ 009B /009B/ The “Day unit” of fibered categories Definition

Let be a monoidal category. We may define a “Day unit” for over to be given by the discrete fibration , which corresponds under the Grothendieck construction to the presheaf represented by .

2023 9 20 https://www.jonmsterling.com/009D/ 009D /009D/ The preserves cartesian fibrations Conjecture

If and are cartesian fibrations over a semimonoidal category , then the is also a cartesian fibration.

I believe, but did not check carefully, that when and are discrete fibrations over a semimonoidal category then the is precisely the discrete fibration corresponding to the (contravariant) Day convolution of the presheaves corresponding to and . Likewise when is monoidal, it appears that the corresponds precisely to the traditional one.
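
For comparison, the classical (contravariant) Day convolution of presheaves F, G : B^op → Set over a semimonoidal category B is given by the coend formula

\[ (F \circledast G)(c) \;\cong\; \int^{a,b \in B} B(c,\, a \otimes b) \times F(a) \times G(b), \]

and, if the claim just made holds, the pushforward-of-a-fibre-product construction above computes exactly this fibrewise under the correspondence between discrete fibrations and presheaves.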

There remain some interesting directions to explore. First of all, the claims above would obviously lead to a new construction of the Day convolution monoidal structure on the 1-category of discrete fibrations on that coincides with the traditional one up to the Grothendieck construction. But in general, we should expect to exhibit both and as monoidal bicategories, a result that I have not seen before.

2023 9 20 https://www.jonmsterling.com/009E/ 009E /009E/ A monoidal bicategory of fibered categories Conjecture

Let be a fibered category. Then the and extend to a monoidal structure on the bicategory of fibered categories over .

is highly non-trivial, as monoidal bicategories are extremely difficult to construct explicitly. I am hoping that ’s ideas involving could potentially help.

2023 9 17 https://www.jonmsterling.com/0094/ 0094 /0094/ On the relationship between and

I have been thinking again about the relationship between and and other approaches to type refinements. One of the defining characteristics of that I thought distinguished it from was the treatment of types: in , types only depend on the “computational” / unrefined aspect of their context, whereas types in are allowed to depend on everything. In the past, I mistakenly believed that this was due to the realizability-style interpretation of , in contrast with ’s gluing interpretation. It is now clear to me that (1) is actually glued (in the sense of q-realizability, no pun intended), and (2) the nonstandard interpretation of types in corresponds to adding an additional axiom to , namely the tininess of the generic proposition.

It has been suggested to me by that this property of may not be desirable in all cases (sometimes you want the types to depend on quantitative information), and that for this reason, graded type theories might be a better way forward in some applications. My results today show that is, in essence, what you get when you relax the ’s assumption that types do not depend on quantitative information. This suggests that we should explore the idea of multiplicities within the context of : since any monoidal product on the subuniverse spanned by closed-modal types quite directly induces a form of variable multiplicity in , I expect this direction to be fruitful.

My thoughts on the precise relationship between the models and Artin gluing will be elucidated at a different time. Today, I will restrict myself to sketching an interpretation of a QTT-style language in STC assuming the generic proposition is internally tiny.

Let be an elementary topos equipped with a subterminal object inducing an open subtopos and its complementary closed subtopos . This structure is the basis of the interpretation of ; if you think of in terms of refinements, then stuff from is “computational” and stuff from is “logical”.

We now consider the interpretation of a language of (potentially quantitative) refinements into . A context is interpreted by an object of ; a type is interpreted by a family ; a term is interpreted as a map such that is the unit of the monad.

So far we have not needed anything beyond the base structure of in order to give an interpretation of types in ’s style. But to extend this interpretation to a universe, we must additionally assume that is internally tiny, in the sense that the exponential functor is a left adjoint. Under these circumstances, the idempotent monad corresponding to the open immersion has a right adjoint , an idempotent comonad.

2023 9 17 Proof

As is the exponential functor , its right adjoint is therefore the “root functor” that exhibits as an internally tiny object.
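
Writing ◯X for the open modality X^φ and √ for its right adjoint (notation chosen here only for the displayed formulas), the content of this observation is the natural bijection

\[ \mathrm{Hom}\bigl(X^{\varphi}, Y\bigr) \;\cong\; \mathrm{Hom}\bigl(X, \sqrt{Y}\,\bigr). \]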

Although lifts to each slice of , these liftings do not commute with base change; this will, however, not be an obstacle for us.

We will now see how to use the adjunction to interpret a universe, either for the purpose of interpreting universes of refinement types, or for the purpose of strictifying the model that we have sketched. Let be a (standard) universe in , e.g. a ; we shall then interpret the corresponding universe of refinements as . To see that classifies -small families of refinements, we compute as follows:

  1. A code amounts to nothing more than a morphism .
  2. By adjoint transpose, this is the same as a morphism .

Thus we see that if is generic for -small families of (arbitrary) types in , then is generic for -small families of type refinements, i.e. types whose context is -modal.
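
Spelled out in the same ad hoc notation (◯ for the open modality, √ for its right adjoint), the correspondence is the chain of bijections

\[ \mathrm{Hom}(\Gamma, \sqrt{U}\,) \;\cong\; \mathrm{Hom}(\mathord{\bigcirc}\Gamma,\, U) \;\cong\; \bigl\{\text{$U$-small families over } \mathord{\bigcirc}\Gamma \bigr\}, \]

so a refinement type in context Γ is precisely a type over the ◯-modal reflection of that context.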

Finally, we comment that the tininess of is satisfied in many standard examples, the simplest of which is the Sierpiński topos .

2023 9 5 https://www.jonmsterling.com/008J/ 008J /008J/ Classifying topoi and generalised abstract syntax

For any small category , -sorted abstract syntax lives in the functor category , where is some 2-monad on ; any such functor denotes a set that is indexed in sorts and contexts (where the 2-monad takes a category of sorts to the corresponding category of contexts). When is a set and is either the finite limit or the finite product completion, we recover the standard notions of many-sorted abstract syntax; in general, we get a variety of forms of dependently sorted or generalised abstract syntax.

We will assume here that is the free finite limit completion 2-monad; our goal is to study ’s general substitution monoidal structure from the point of view of classifying topoi, building on ’s analogous observations () on the non-symmetric monoidal structure of the object classifier. The topos theoretic viewpoint that we will explore is nothing more than a rephrasing of ’s account in terms of the Kleisli composition in a 2-monad; nonetheless the perspective of classifying topoi is enlightening, as it provides an explanation for precisely what internal geometrical structure one expects in a given topos for abstract syntax, potentially leading to improved internal languages.
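
For orientation, recall the familiar single-sorted special case, stated here from memory and purely as a touchstone for the generalised setting: for functors X, Y from the category of finite cardinals to Set, the substitution tensor is

\[ (X \bullet Y)(n) \;\cong\; \int^{m \in \mathbb{F}} X(m) \times Y(n)^{m}, \]

read as "an X-term with m free variables, each substituted by a Y-term in context n".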

For any small category , the category of presheaves corresponds to the classifying topos of diagrams of shape . Following , we shall write for this “affine” classifying topos; under the conventions of op. cit., we may then identify the category of sheaves with the presheaf category .

The universal property of as the classifying topos of -diagrams means that for any topos , a diagram corresponds essentially uniquely (by left Kan extension) to a morphism of topoi . We have a generic -shaped diagram corresponding under this identification to the identity map on . More explicitly, the diagram is the following composite:

Given a morphism of topoi , we may recover the diagram that it classifies as the composite .

In case , then, we have a correspondence between -shaped diagrams of sheaves on and endomorphisms of ; we are interested in representing the compositions of such endomorphisms as a tensor product on the functor category .

In particular, let be two diagrams; taking characteristic maps, we have endomorphisms of affine topoi , which we may compose to obtain ; then, we will define the tensor to be the diagram whose characteristic morphism of affine topoi is . In other words:

To give an explicit computation of the tensor product, we first compute the inverse image of any on representables for . As any left exact functor is the right Kan extension of along , we can conclude that . We will use this in our calculation below, setting .

We are now prepared to compute the tensor product of any .

Finally, we can relate the computation above to that of in terms of coends.

Above, we have written and for the tensoring and cotensoring of over respectively. Thus, the fully pointwise computation is as follows:

Thanks to and for helpful discussions.

2023 8 31 https://www.jonmsterling.com/0084/ 0084 /0084/ Scientific refereeing using

I have long been an enthusiast for , a genre of computer software that deserves more than almost any other to be called an “elegant weapon for a more civilized age”. Recently I have been enjoying experimenting with ’s highly outliner for macOS called , which builds on a lot of his previous (highly impressive) work in the area with a level of fit and finish that is rare even in the world of macOS software. and is well worth it; watch the or try the demo to see for yourself.

The purpose of outliners is to provide room to actively think; is said to have been unable to think at all without a pen in his hand, and I think of outliners as one way to elevate the tactile aspect of active thinking using the unique capabilities of software. Tools for thinking must combat stress and mental weight, and the most immediate way that outliners achieve this is through the ability to focus on a subtree — narrowing into a portion of the hierarchy and treating it as if it were the entire document, putting its context aside. This feature, which some of my readers may recognize from Emacs org-mode, is well-represented in — without, of course, suffering the noticeable quirks that come from the Emacs environment, nor the sadly adopted by org-mode.

As a scientist in academia, one of the most frequent things I am doing when I am not writing my own papers or working with students is refereeing other scientists’ papers. For those who are unfamiliar, this means carefully studying a paper and then producing a detailed and well-structured report that includes a summary of the paper, my personal assessment of its scientific validity and value, and a long list of corrections, questions, comments, and suggestions. Referee reports of this kind are then used by journal editors and conference program committees to decide which papers deserve to be published.

In this post, I will give an overview of my refereeing workflow with and how I overcame the challenges of transferring finished referee reports from into the text-based formats used by conference refereeing platforms like and , using a combination of XSLT 2.0 and . This tutorial applies to 1.14; I hope the format will not change too much, but I cannot make promises about what I do not control.

2023 8 29 https://www.jonmsterling.com/0089/ 0089 /0089/ Refereeing in an outliner

Most scientific conferences solicit and organize reviews for papers using a web platform such as and ; although these are not the same, the idea is similar. Once you have been assigned to referee a paper, you will receive a web form with several sections and large text areas in which to put the components of your review; not all conferences ask for the same components, but usually one is expected to include the following in addition to your (numerical) assessment of the paper’s merit and your expertise:

  1. A summary of the paper
  2. Your assessment of the paper
  3. Detailed comments for the authors
  4. Questions to be addressed by author response
  5. Comments for the PC (program committee) and other reviewers

Usually you will be asked to enter your comments under each section in a plain text format like Markdown. The first thing a new referee learns is not to type answers directly into the web interface, because this is an extremely reliable way to lose hours of your time when a browser or server glitch deletes all your work. Most of us instead write out our answers in a separate text editor, and paste them into the web interface when we are satisfied with them. In the past, I have done this with text files on my computer, but today I want to discuss how to draft referee reports as outlines in ; then I will show you how to convert them to the correct plain text format that can be pasted into your conference’s preferred web-based refereeing platform.

To start off, have a look at the below, which shows my refereeing template outline in .

2023 8 28 https://www.jonmsterling.com/0086/ 0086 /0086/ A outline for conference refereeing Figure
This outline contains the skeleton of a referee report for a major computer science conference. ’s row types are used extensively: for example, sections are formatted as headings, and prompts are formatted as notes.

As you can see, there is a healthy combination of hierarchy and formatting in a outline.

2023 8 29 https://www.jonmsterling.com/008B/ 008B /008B/ Rich text editing in

is a rich text editor, but one that (much like ) avoids the classic pitfalls of nearly all rich text editors, such as the ubiquitous and dreaded “Is the space italic?!” user-experience failure; I will not outline here, but I suggest you check it out for yourself.

2023 8 29 https://www.jonmsterling.com/008A/ 008A /008A/ Row types in

One of the most useful features of ’s approach to formatting is the concept of a row type, which is a semantic property of a row that has consequences for its visual presentation. currently supports the following row types:

  • Plain rows
  • Heading rows, formatted in boldface
  • Note rows, formatted in gray italics
  • Quote rows, formatted with a vertical bar to their left
  • Ordered rows, formatted with an (automatically chosen) numeral to their left
  • Unordered rows, formatted with a bullet to their left
  • Task rows, formatted with a checkbox to their left

My uses several of these (headings, notes, tasks) as well as some of the (highlighting). When I fill out the refereeing outline, I will use other as well, including quotes, ordered, and unordered rows. I will create a subheading under Detailed comments for the author to contain my questions and comments, which I enter as ordered rows; then I make a separate subheading at the same level for Typographical errors, which I populate with unordered rows. Unordered rows are best for typos, because each typo is already accompanied by a line number. If I need to quote an extended portion of the paper, I will use a quote row.

When working on the report outline, I will constantly be focusing on individual sections to avoid not only distractions but also the intense mental weight of unfinished sections. Focusing means that the entire outline is narrowed to a subtree that can be edited away from its context; this is achieved in by pressing the gray south-easterly arrows to the right of each heading, as seen in .

2023 8 31 https://www.jonmsterling.com/008C/ 008C /008C/ From an outline to a plain text referee report

Although we have seen how pleasant it is to use an outliner like to draft a referee report, we obviously cannot submit a .bike file to a conference reviewing website or a journal editor. Most conference reviewing systems accept plain text or Markdown responses, and so our goal will be to convert a outline into reasonably formatted Markdown.

It happens that ’s underlying format is HTML, so one idea would be to use to process this HTML into Markdown. This would work, except that ’s model is sufficiently structured that it must make deeply idiosyncratic use of HTML tags, as can be seen from the below.

2023 8 31 https://www.jonmsterling.com/008G/ 008G /008G/ The source code to a typical outline Listing
  • Tasks

    • read through paper on iPad and highlight

  • Syntax and semantics of foo bar baz

    • Overall merit:

    • Reviewer Expertise:

    • Summary of the paper

      • Please give a brief summary of the paper

      • This paper describes the syntax and semantics of foo, bar, and baz.

    • Assessment of the paper

      • Please give a balanced assessment of the paper's strengths and weaknesses and a clear justification for your review score.

    • Detailed comments for authors

      • Please give here any additional detailed comments or questions that you would like the authors to address in revising the paper.

      • Minor comments

        • line 23: "teh" => "the"

        • line 99: "fou" => "foo"

    • Questions to be addressed by author response

      • Please list here any specific questions you would like the authors to address in their author response. Since authors have limited time in which to prepare their response, please only ask questions here that are likely to affect your accept/reject decision.

    • Comments for PC and other reviewers

      • Please list here any additional comments you have which you want the PC and other reviewers to see, but not the authors.

      • In case any one is wondering, I am an expert in foo, but not in bar nor baz.


The Markdown that would result from postprocessing a outline directly with would be deeply unsuitable for submission. We will, however, use a version of this idea: first we will preprocess the format into more conventional (unstructured) HTML using XSLT 2.0, and then we will use to convert this into Markdown.

2023 8 31 https://www.jonmsterling.com/008E/ 008E /008E/ System requirements to convert outlines to Markdown

XSLT 2.0 is unfortunately only implemented by proprietary tools like , developed by Saxonica. Nonetheless, it is possible to freely install Saxon on macOS using Homebrew:

brew install saxon

You must also install , which is also conveniently available as a binary on Homebrew:

brew install pandoc

With the out of the way, we can proceed to prepare an XSLT stylesheet that will convert ’s idiosyncratic use of HTML tags to more conventional HTML that can be processed into Markdown by . The stylesheet bike-to-html.xsl is described and explained in the below.

2023 8 29 https://www.jonmsterling.com/0087/ 0087 /0087/ An XSLT 2.0 transformer to convert outlines to HTML Listing

We can convert outlines to reasonable HTML using an XSLT 2.0 stylesheet, bike-to-html.xsl, detailed below.


We will allow several tags to be copied verbatim into the output, as uses these in the same way that idiomatic HTML does.


leaves behind a lot of empty span elements; we drop these.


uses ul for all lists; the list type is determined not at this level, but rather by each individual item’s @data-type attribute. To get this data into the HTML list model, we must group items that have the same @data-type and wrap them in an appropriate list-forming element.

To do this, we use XSLT 2.0’s xsl:for-each-group instruction to group adjacent li elements by their @data-type attribute. (It is extremely difficult and error-prone to write equivalent code in the more widely available XSLT 1.0.) We must convert @data-type to a string: otherwise, the transformer will crash when it encounters an item without a @data-type attribute.


Next, we match each individual li element; the content of a list item is stored in a p element directly under li, so we let the transformer fall through the parent and then format the content underneath according to the @data-type of the item.


    has correctly adopted the optimal and model of hierarchy, in contrast to HTML; this means that the depth of a heading is not reflected in the element that introduces it, but is instead inferred from its actual position in the outline hierarchy. To convert outlines to idiomatic HTML, we must flatten the hierarchy and introduce explicit heading levels; luckily, this is easy to accomplish in XSLT by counting the ancestors of heading type.


    The remainder of the row types are not difficult to render; you may prefer alternative formatting depending on your goals.


    Next, we can use Saxon to convert a outline to idiomatic HTML using the .

    cat review.bike | saxon -xsl:bike-to-html.xsl - > review.html

    Go ahead and open the resulting HTML file in a text editor and a browser to see the results.

    2023 8 31 https://www.jonmsterling.com/008F/ 008F /008F/ A transformed to idiomatic HTML Listing

    The following is the result of transforming an to idiomatic HTML using an .

    1. read through paper on iPad and highlight

    Overall merit:

    Reviewer Expertise:

    Summary of the paper

    Please give a brief summary of the paper

    This paper describes the syntax and semantics of foo, bar, and baz.

    Assessment of the paper

    Please give a balanced assessment of the paper's strengths and weaknesses and a clear justification for your review score.

    Detailed comments for authors

    Please give here any additional detailed comments or questions that you would like the authors to address in revising the paper.

    Minor comments

    • line 23: "teh" => "the"
    • line 99: "fou" => "foo"

    Questions to be addressed by author response

    Please list here any specific questions you would like the authors to address in their author response. Since authors have limited time in which to prepare their response, please only ask questions here that are likely to affect your accept/reject decision.

    Comments for PC and other reviewers

    Please list here any additional comments you have which you want the PC and other reviewers to see, but not the authors.

    In case any one is wondering, I am an expert in foo, but not in bar nor baz.


    Next, we will process this HTML file using ; unfortunately, leaves behind a lot of garbage character escapes that are not suitable for submission anywhere, so we must filter those out using sed.

    2023 8 31 https://www.jonmsterling.com/008H/ 008H /008H/ A transformed to Markdown Listing

    The following is the result of converting the of a to Markdown using , with some light postprocessing by sed.

    "the" - line 99: "fou" => "foo" ## Questions to be addressed by author response *Please list here any specific questions you would like the authors to address in their author response. Since authors have limited time in which to prepare their response, please only ask questions here that are likely to affect your accept/reject decision.* ## Comments for PC and other reviewers *Please list here any additional comments you have which you want the PC and other reviewers to see, but not the authors.* In case any one is wondering, I am an expert in foo, but not in bar nor baz.]]>

    We can compose all these tasks into a one-liner as follows:
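
    The following sketch shows the shape of the pipeline; the output filename and the exact sed expression are illustrative stand-ins, and the scripts gathered in the repository mentioned below are the authoritative versions:

    cat review.bike | saxon -xsl:bike-to-html.xsl - \
      | pandoc --from html --to markdown \
      | sed -e 's/\\\([][`*_#<>]\)/\1/g' \
      > review.md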

    2023 8 31 https://www.jonmsterling.com/008I/ 008I /008I/ A convenient -to-Markdown script

    I have gathered the scripts to convert outlines into Markdown via idiomatic HTML in a where they can be easily downloaded. If you have any improvements to these scripts, please to my ! I am also interested in whether it is possible to write the as equivalent XSLT 1.0, to avoid the proprietary tool. Feel free also to send comments on this post to my public inbox, or discuss with me on .


    https://www.jonmsterling.com/0088/ 0088 /0088/ Crowdfunding and sponsorship ☕

    Apart from my day-job at the , I am independently researching and developing software like that you can use to unlock your brain. If you have benefited from this work or the writings on my , please consider supporting me with a .

    2023 8 18 https://www.jonmsterling.com/007T/ 007T /007T/ A synthetic proof of HTT 7.2.1.14

    states the following result as Proposition 7.2.1.14 of :

    Let be an ∞-topos and let be the left adjoint to the inclusion. A morphism in is an effective epimorphism if and only if in is an effective epimorphism in the ordinary topos .

    that the proof of this result given by is circular. The result is true and can be recovered in a variety of ways, as pointed out in the . For expository purposes, I would like to show how to prove this result directly in univalent foundations using standard results about -truncations from the .

    First we observe a trivial lemma about truncations.

    2023 8 18 https://www.jonmsterling.com/007U/ 007U /007U/ Propositions do not notice set truncation of indices Lemma

    Any proposition is orthogonal to the map sending to .

    2023 8 18 Proof

    This follows from the dependent induction principle of set-truncation, as any proposition is a set.
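
    Unfolded, the orthogonality in question says that for any proposition P, precomposition with the set-truncation map of A induces an equivalence

    \[ \bigl(\lVert A \rVert_0 \to P\bigr) \;\simeq\; \bigl(A \to P\bigr), \]

    which is the form in which the lemma is used below.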

    It follows from that is equivalent to . This is enough to deduce the main result.

    2023 8 18 https://www.jonmsterling.com/007V/ 007V /007V/ Set-truncation preserves and reflects surjections Theorem

    A function is surjective (i.e. an effective epimorphism) if and only if its set-truncation is surjective.

    2023 8 18 Proof

    We deduce that is equivalent to as follows:

    by definition
    by induction
    by
    by
    by
    by definition
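
    For the record, the two "by definition" steps unfold the standard definition of surjectivity in univalent foundations, namely that a map f : A → B is surjective when

    \[ \textstyle\prod_{b : B} \bigl\lVert \mathrm{fib}_{f}(b) \bigr\rVert_{-1} \]

    holds, applied once to f itself and once to its set-truncation.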

    Incidentally, the appeal to can be replaced by the more general , which applies to an arbitrary reflective subuniverse and the corresponding subuniverse of separated types.
