How to Reason from First Principles


This post is a follow-up to my general questions on AI post, and expands on ideas I published here and here in the ancient times.

Every time one of the labs releases an updated model I give it a thorough shakedown on physics, in the style of the oral examination that is still used in Europe and a few other places. Claude, Grok, Gemini, and GPT are all advancing by leaps and bounds on a wide variety of evals, some of which include rather advanced or technical questions in both math and science, such as Physics Olympiad-style problems or grad school qualifying exams.

And yet, none of these models would be able to pass the physicist Turing test. It’s not even a matter of knowledge: I know of reasonably talented middle schoolers with no specialized physics training who could reason and infer about some of these basic questions in a much more fluent and intuitive way.

For context, I competed in the IPhO in 2005, and still have a copy of my notes (they are short, rough, and not 100% accurate). At that time, the Australian team was drilled rigorously in a particular style of problem solving that, although it took me just two weeks to learn, has powered two decades of subsequent career development. I taught this method as a TA at Caltech to O(100) students, but I remain surprised that it is not taught more widely, and it is clear to me that the models, and the people who train them, remain mostly and unfortunately ignorant of its power.

I am writing this post as a single repository of this method that I can refer friends, colleagues, and friendly AIs back to. For reference, here is my current system prompt.

At the risk of spoiling my personal evals, I’ll also include examples of questions that I ask the AIs. The most recent models tend either to talk around the subject at length without “understanding” or addressing the core question, or to index heavily on a website or blog post that discusses the subject, sometimes even one written by me. Neither approach is satisfactory, since I’m looking to help the models create or discover real insights about our world. Obviously these questions refer heavily to areas of my personal obsession and contrarianism!

  • What is the most powerful nuclear reactor I can fit into a single Starship to be deployed on Mars, in terms of electrical output?
  • Conventional wisdom calls for a massive expansion of the electrical grid to enable decarbonization of our electricity supply, but may not have priced in recent information on falling battery costs. In what sense do batteries and the grid compete and where does the true equilibrium lie?
  • Median lunar dust size is on the order of 100 microns, indicating that impact gardening results in some kind of “stable attractor” phenomenon. These particles are much larger than Mars dust, for example. Derive from first principles what the stable attractor on the Moon should be, and relate it to fundamental physical properties of matter and the solar system. I want something like Wien’s Law, I don’t need the full Boltzmann distribution.
  • When SpaceX figures out the Starship heatshield, what will the answer be?
  • How to desalinate sea water for $99/acre-foot, assuming no special financing or unobtainium?
  • Under what assumptions could a nuclear-electric propelled spacecraft out-accelerate a solar sail in the inner solar system?
  • What is the biochemical explanation for why arboreal squirrels live ten times longer than ground squirrels? (I know it’s downstream of selective pressure and predation, in terms of evolution, but that’s not a biochemical mechanism explanation.)

The Method

The method is called “the seven Ds and the little s”, comprising (in descending order of importance) Diagram, Directions, Definitions, Diagnosis, Derivation, Determination, Dimensions, and substitution.

1. A good diagram captures necessary information and consumes about half a page. The process of drawing a diagram activates the brain’s spatial reasoning capacities and prevents hasty or panicked assumptions. LLMs are bootstrapped on text rather than on hunting animals on the African savanna, so they may need to perform “problem translation” into some other paradigm, but the key thing is to create numerous hooks into the powerful, intuitive mechanisms of thought.

2. Directions establish the local coordinate system, so that all quantities are defined in a consistent frame of reference.

3. Definitions unambiguously assign symbols to concepts or quantities such as length or mass, and help expedite algebraic manipulation later. 90% of this is following conventions but it’s also an essential part of summarizing and communicating the chain of reasoning necessary to identify and solve the problem. Creating a self-consistent symbolic ontology is also a reliable way to identify and avoid definitional inconsistencies and other sources of confusion.

At the end of the first three steps, the problem has been summarized and read into memory. This may seem pointless or obvious, but asking the right question is often the hardest part of solving open or poorly defined problems. What is the relevant figure of merit? How is optimal defined?
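As a purely illustrative sketch (my toy example, not part of the original method), here is what Directions and Definitions might look like when handed to a symbolic tool, for a problem I will reuse in the later steps: a uniform rod of mass m and length L swinging about one end under gravity g. The problem and all symbol names are my choice.

```python
import sympy as sp

# Directions (my convention): pivot at the origin, theta measured from the vertical,
# positive counterclockwise, distance x measured along the rod from the pivot.
# Definitions: one unambiguous symbol per quantity, with positivity assumptions baked in.
m, L, g = sp.symbols('m L g', positive=True)          # rod mass, rod length, gravitational acceleration
t = sp.symbols('t')                                   # time
theta = sp.Function('theta')(t)                       # angular displacement from the vertical
I, omega, T = sp.symbols('I omega T', positive=True)  # moment of inertia, angular frequency, period
```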

4. Diagnosis is the fourth step, and consists of a single phrase summary of the physical principle used to solve the problem, such as application of a conserved quantity or a vector algebra bash. Usually, the more fundamental the principle, the better. There are about 20 different well-developed methods to solve physics problems, but the top four or five will handily apply to 95% of problems. With enough prompting, the LLMs will usually guess correctly which physical principle is at hand, which means that by the conventional grading process, they’re not quite scraping a pass.

5. Derivation takes fundamental physical equations, of which there are about ten, and transforms them into the necessary form for the problem at hand. For example, rather than memorize the moment of inertia for hundreds of different shapes, one would memorize the general formula and, in the Derivation step, solve an integral to give the desired version. The fundamental equations are linked to the diagnosis. Derivation is complete when the textbook formula has been reproduced from fundamental physical principles. This is distinct from pattern matching across a large library of formulas, plugging in numerical values, and hoping that the answer is correct. “Plug and chug” is always wrong, even if the numbers are accidentally correct.
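To make the contrast with “plug and chug” concrete, here is a minimal sympy sketch (my example, not a worked case from the original notes) that derives the moment of inertia of a uniform rod about one end from the general definition I = ∫ r² dm, rather than recalling the tabulated result:

```python
import sympy as sp

m, L, x = sp.symbols('m L x', positive=True)

# General definition I = ∫ r^2 dm, specialized to a uniform rod pivoted at one end:
# dm = (m/L) dx for uniform linear mass density, with r = x the distance from the pivot.
lam = m / L
I_rod = sp.integrate(lam * x**2, (x, 0, L))
print(I_rod)   # m*L**2/3 -- the textbook result reproduced, not recalled
```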

6. Determination (sometimes referred to as d’Algebra) combines derived equations to produce a formula for the solution of the problem. By convention, this formula is boxed to make it easier to find on the page. This step takes the expansion in derivation and contracts it again to the specific problem, expressing the answer in algebraic terms. This step, conventionally done with paper and pencil, often requires basic mechanisms of algebraic manipulation, simplifying assumptions (justified), expansions, factorization, etc. I remember the thrill of working through a logical procedure to derive the Saha equation, which relates gas ionization to temperature and pressure. There are a few extremely unintuitive consequences of the fundamental physical laws, including the fractional quantum Hall effect, black holes, antimatter, the existence of computation, and the existence of life. Yet all these wonders are accessible to a talented high school student following logic, intuition, ancient Greek math technology, and this problem solving algorithm.
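A hedged sketch of Determination for the swinging-rod toy problem above (again my illustration, not the author’s worked example): combine the derived moment of inertia with the small-angle equation of motion and solve symbolically for the quantity that would be boxed on paper.

```python
import sympy as sp

m, L, g, I, omega = sp.symbols('m L g I omega', positive=True)

equations = [
    sp.Eq(I, m * L**2 / 3),                # from the Derivation step: rod about one end
    sp.Eq(omega**2, m * g * L / (2 * I)),  # small-angle EOM I*theta'' = -(m*g*L/2)*theta
]
solution = sp.solve(equations, [I, omega], dict=True)[0]
print(sp.simplify(solution[omega]))        # equivalent to sqrt(3*g/(2*L)), the "boxed" answer
```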

7. Dimension check ensures the dimensions on each side of the equation match, and is a good checksum for detecting algebraic mistakes. A successful dimension check is notated with a smiley face. Given that Mathematica has existed for decades, you would think the LLMs would have algebraic manipulation tools that don’t make basic errors, but that is incorrect. Claude and Grok (and humans) routinely drop terms when re-arranging equations, and only a dimension check will find them. Unfortunately, like humans, the LLMs will often spend pages of text justifying a dimensional error rather than realizing they’ve dropped a term. Their tendency to be obsequious can become self-defeating when it comes to critically evaluating their own work. This is also the step where you test conclusions against obvious intuitions, extreme values, and initial assumptions. This is often the step where you first learn something subtle about the system analyzed by the question.
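As a minimal illustration of the dimension check (using sympy’s unit quantities; any units library would do), one can verify that ω² = 3g/(2L) from the sketch above really carries units of inverse time squared:

```python
import sympy as sp
from sympy.physics.units import meter, second

# Dimension check on omega**2 = 3*g/(2*L): the right-hand side should carry units of second**-2.
g_units = meter / second**2   # dimensions of gravitational acceleration
L_units = meter               # dimensions of the rod length
rhs = sp.Rational(3, 2) * g_units / L_units
print(rhs)                    # 3/(2*second**2) -> omega is a frequency, as it must be :)
```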

8. Finally, if necessary, numerical values can be substituted in to get a numerical answer. This is the first and only time that a calculator is used. The calculator cannot deliver physical insight.
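A last hedged sketch for the substitution step: numbers go into the boxed result T = 2π√(2L/3g) for the toy rod problem only after the algebra is finished (L = 1 m and g = 9.81 m/s² are my illustrative values, not from the original post).

```python
import sympy as sp

L, g = sp.symbols('L g', positive=True)
T = 2 * sp.pi * sp.sqrt(2 * L / (3 * g))   # boxed result from the Determination step, T = 2*pi/omega
print(T.subs({L: 1.0, g: 9.81}).evalf(4))  # ~1.638 (seconds) for a 1 m rod
```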

I could write an entire book on this method expanding on this post, but it would have extremely limited pedagogical utility. The student needs to apply this method dozens of times to understand and interpret it.

LLMs are not very sample efficient, and there just aren’t that many worked examples of this method in the common crawl. Either we generate 100-10,000x more examples of this method being worked for training, or we find a new sample-efficient training paradigm, or we get way better at system prompts. Part of the challenge with RLHF is that even among trained postgraduate physicists, whose day-to-day work approximately never requires solving problems this way, a working knowledge of this method is very rare. But I believe that if we want the LLMs to derive novel insights about the physical world, we need to build this capability somehow.
