July 4, 2025
Zürich, Switzerland

If you asked me where I fall in the AI camp, I’m in the skeptical-but-curious quadrant (example). Notwithstanding the various legitimate critiques of what everyone is calling AI and doing with it, I have been finding some interesting utility with it in my — I don’t want to say everyday, but — week-to-week work. This is particularly in the space of large language models (LLMs).
Places and Niches for Use
How and where I’ve found utility with LLMs is not quite what proponents have framed, nor quite what detractors have derided.
As a Rubber Duck Paralegal in Exploring Foreign Ecosystems
Coming up with a name for this category is difficult. Let me try to define what I mean. If you’re familiar with rubber ducking, this will click. The premise is this: I’ve found myself in a new land working with something I am not super familiar with.
Several examples:
- R Programming Language: I used to be fluent in this back in university, but that was eons ago, and I don’t have the capacity right now in life to commit something I use irregularly — though enjoy — to memory.
- Rust: I’ve been dabbling in it a bit in my spare time, written some small tools, and have several long-term hobby projects written in it. I’ve even contributed to some notable projects built in it. That said, my day job doesn’t require me to work with it, and hobby time is so spare that aspects of it do not get committed to memory (e.g., the library ecosystem and certain constructs).
- Various Python Data Processing Libraries: I used to be an advanced Python programmer, admittedly even too clever for my own good. But, as with R above, there are entire library ecosystems with their own ways and idioms in the data processing space. I don’t use them regularly enough to commit them to memory yet.
Now, what do these examples have to do with rubber ducking? I’ve found using the LLM chat loop interesting in the following way: asking the LLM about a class of problem as a generality in that foreign context and having it explain the problem and solution space.
What are the ways one can join two data frames in the R Programming Language?
What APIs in Rust are most comparable to Go’s bytes.Buffer?
What are the general tradeoffs in choosing whether a user-authored type in Rust implements the Copy trait?
Note how open-ended these questions are. I’m not giving the LLM my code (my homework) and asking it to do it (for me), but rather asking it to provide high-level counsel that opens the door for deeper exploration. I do not take the LLM output as infallible but rather use it to help frame the second round of human research that I do manually with API reference manuals and other primary source literature.
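To make concrete the kind of answer space the first question opens up, here is a minimal sketch of data-frame joins — in Python’s pandas rather than R, since both ecosystems appear in this post and pandas’ `merge(..., how=...)` mirrors R’s `merge()` and dplyr’s `*_join()` family. The column names and data are invented for illustration:

```python
# Joining two data frames with pandas. The `how` parameter selects
# the join flavor: "inner", "left", "right", or "outer".
import pandas as pd

left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [20, 30, 40]})

# Inner join: only rows whose key appears in both frames.
inner = left.merge(right, on="id", how="inner")

# Outer join: the union of keys, with NaN where a side has no match.
outer = left.merge(right, on="id", how="outer")

print(inner)
print(outer)
```

From an answer like this, the follow-up research in the primary documentation (the `how` and `on` parameters, suffix handling for overlapping columns) is straightforward.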
Reasonably well-crafted WWW search queries can yield answers to questions like these, yes. The problem is that some ecosystems are not conducive to search due to their name (e.g., “R”), or the query involves punctuation that typical search indexing technology treats as stop words and does not play well with (e.g., “->”).
Matters get more interesting when you consider that ecosystems sometimes have their own domain vocabulary, and trying to search for things without knowing that vocabulary is hard. Consider how the ggplot2 ecosystem refers to “faceting.” How ggplot2 uses this term makes complete sense, but that wouldn’t be the first noun I would use when searching for this (were I unfamiliar with ggplot2 jargon).
So the LLM has been reasonably productive when engaging with it in a semi-Rubber Duck dialogue. I research outside materials further after consulting it, try to formulate a solution by hand, reflect, and iterate.
This is the key part: I am keeping my brain active during this. I’m not asking the LLM to do my work for me, like this:
Write me a small program in the R Programming Language.
This program should output a data frame that is like a time series. The output data frame should include time value as one column and a boolean state for the second column.
The program’s input is the time interval [X, Y].
The program should walk this time interval at minute-level granularity and provide a row in the output data frame for the current time iteration and a boolean state to indicate whether the sun is above the Earth’s horizon at latitude LAT and longitude LONG.
Aside: What I described above is a real R program that I hand wrote without LLM assistance.
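Purely as an illustration of the spec above — and in Python rather than the author’s hand-written R, which is not reproduced here — the program can be sketched with a standard low-precision solar-position approximation (declination plus hour angle, ignoring the equation of time). That is good to a degree or two, which suffices for an above/below-horizon boolean away from the exact sunrise and sunset minutes:

```python
# Sketch of the ephemeris program: walk a time interval at minute
# granularity and record whether the sun is above the horizon.
from datetime import datetime, timedelta
from math import sin, cos, radians

def sun_above_horizon(t, lat, lon):
    """True if the sun's approximate elevation at UTC time t is positive."""
    n = t.timetuple().tm_yday
    # Low-precision solar declination (degrees).
    decl = -23.44 * cos(radians(360.0 / 365.0 * (n + 10)))
    # Approximate local solar time: UTC plus 4 minutes per degree of longitude.
    solar_hours = t.hour + t.minute / 60.0 + lon / 15.0
    hour_angle = 15.0 * (solar_hours - 12.0)
    sin_elev = (sin(radians(lat)) * sin(radians(decl))
                + cos(radians(lat)) * cos(radians(decl)) * cos(radians(hour_angle)))
    return sin_elev > 0.0

def daylight_rows(start, end, lat, lon):
    """Walk [start, end] at minute granularity; return (time, sun_up) rows."""
    rows = []
    t = start
    while t <= end:
        rows.append((t, sun_above_horizon(t, lat, lon)))
        t += timedelta(minutes=1)
    return rows

# Two hours around solar noon at the equator on the June solstice.
rows = daylight_rows(datetime(2025, 6, 21, 11, 0),
                     datetime(2025, 6, 21, 13, 0), 0.0, 0.0)
```

A production version would, as suggested below, lean on a vetted ephemeris package rather than hand-rolled trigonometry.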
The solution the LLM emits might work, or it might not. That part is immaterial. What matters is whether my brain remains actively involved. With this ephemeris prompt, I would be doing little more than copy-pasting the solution. I wouldn’t be reflecting on it or trying to solve the problem myself. I don’t want that, so that’s not how I use the tool.
Instead, I break this task apart and probably ask several subqueries, then write and evaluate the code myself. Knowing my base knowledge of R, I’d probably skip the LLM and go directly to CRAN to find a reputable package for this type of ephemeris data.
And this is where the active-brain part is important: an LLM can be a tool that provides some research-discovery shortcuts, but the brain needs to adjudicate this information, verify it, and then apply what the LLM has suggested on its own. So: learning by doing.
As a non-contrived future example, I could see this being useful for me later on when working with Lua programs to extend Neovim. The main task would be to surface the requisite parts of the Neovim API in Lua and learn how to use Lua productively in that context. Like R, Lua isn’t something I have been able to commit to memory (yet).
First-Round Survey Research
There’s another twist to the above where using an LLM has been interesting: performing broad information-gathering and surveying tasks. This is at least true of Gemini and what it calls its “Deep Research” mode today.
What I mean is this:
What can you tell about the Osage Hills apartments in Tulsa, Oklahoma? When did they come into existence, what did people think of them, and why were they razed?
Now, I invite you to attempt to answer these questions yourself using whatever means you have available to you. Spoiler: You’ll probably not find a lot of information.
So that might seem like a weird question, right? I grew up in Tulsa and lived not too far away from those apartments. I recall seeing them from the Osage Expressway each and every day. They stood out against everything else (you’re at the Northwest edge of Metropolitan Tulsa, the intersection of urban, of the rolling hills before the Ozarks, the Cross Timbers, and the Great Plains; the visual juxtaposition of the structure was great). They were brick behemoths. And they were also — so it appeared — derelict. So imagine how indelibly that image can get stamped into a young child’s mind.
Through family history in the region, I knew a decent amount about the Osage Hills apartments.
- They cropped up after World War II to address an immediate housing shortfall.
- They were once very nice.
My father relayed a story about how his father’s friend (who had a son the same age as my father) struck it rich by chance: he bought some shares of an STP Oil franchise before the firm was really known, enjoyed the market appreciation of the shares, sold, and became extraordinarily wealthy from the proceeds.
That family moved to the Osage Hills apartments after hitting the jackpot. These apartments were seen as an upgrade over wherever the family had been living (presumably a dump south of Tulsa’s downtown, before Riverside).
Funnily, my father’s parents (poor) had been burned too often to buy into what they thought was another snake-oil scam of that era, and they remained on the border between poor and lower middle class until they died.
Honestly, this reminds me a lot about what’s happened with the greater fool theory and cryptocurrency and the meme stocks.
I wonder how that family is doing today (many generations later). Did they retain the wealth and grow it, or was it lost within a generation or two? My parents came from very modest backgrounds. Something like this could have changed their lives.
- Urban decay set in sometime later. My mother refused to drive near these apartments when I was growing up, and they were essentially next to where we lived.
I wonder when the family above moved out. I’d wager the mid-1960s at the latest.
- The structures were razed in the 2000s.
Much to my surprise, Gemini using Deep Research mode was able to surface a good number of era-correct primary source documents about this apartment complex. This included items from:
- The Department of Housing and Urban Development (HUD)
- The City of Tulsa
- Tulsa County
The thing is: a considerable fraction of these materials was not visible through ordinary searches online. Chalk up another one for the dead-internet theory, I guess. Setting that oddity aside:
Many of these were documents scanned into PDFs that Gemini had OCR-ed, so I am moderately impressed with what it could do. That said, I wasn’t interested in Gemini’s generated summary; I was interested in what source material it could surface, and it excelled at that. This augmented my family’s historical familiarity considerably.
Aside: Truth be told, the best way to have garnered this information would have been a trip to the Tulsa County Library and possibly the county records office, but I don’t live there anymore; it’s very far from where I live now.
And the truth is, Gemini’s Deep Research has been fantastic when used in this capacity. There are plenty of other open-ended research inquiries I’ve thrown at it that just work when it comes to whittling down information. Some examples:
-
I wanted to know exactly where something unremarkable (that nobody except me cares about) was geographically. I had a description of what the area looked like and knew approximately where it could be (within roughly 300 square kilometers). Given the prompt, Gemini’s Deep Research mode nailed this essentially perfectly from looking at abstract map data (presumably from Google Maps).
-
I recall some particularly fucked up movie on television in the early 1990s that featured a scene wherein a woman was fed to some pigs as punishment. I remembered some details about the film’s context: it portrayed the 1800s or earlier, somewhere in the United States, and it featured religious fundamentalists. Every couple of years, I tried to figure out what it was using classical search and research. No luck. With a moderately decent prompt (e.g., a description of the pig scene, the year and month I saw it on TV, what television station I thought it was, and where I saw the broadcast, locality-wise), Gemini’s Deep Research uncovered The Devonsville Terror. I loaded up the film, and — sure enough — this was it. As stupid as it sounds, this level of recall surprised me. Moreover, Deep Research corroborated the finding through digital copies of TV Guide for that era and locality in Texas.
In short, these models have produced (sub-)Wikipedia-grade material: a good point to dive into for further research.
Closing
The key takeaway for me is that an LLM can be very useful if you are judicious in how you use it. A human being who has a good comprehension of both the problem they want to solve and the tools that are available will be far more productive with something like an LLM than a user without that familiarity.
The reason is this: the human with familiarity will have a better sense of what the optimal solution looks like and where it falls short. The one who rotely uses whatever the model spits out won’t — and might as well be copying their desk mate’s answers on a test, learning nothing. Copying answers gives you a false sense of productivity, but it will never give you mastery.
In short, I’m not about to lean on any of these tools to replace my own work and thinking, but I might be willing to delegate some mundane tasks and verify what it produces.
Again, per what I stated at the top, nobody should interpret this as shilling for these products or anything similar. I mainly wanted to report what I have found useful and surprising. Having something that acts like an aide can be helpful.
Another perspective worth sharing: https://bentsukun.ch/posts/ai-nuance/