Prompting LLMs is not engineering


With the proliferation of AI models and tools, there's a new industry-wide fascination with a snake oil remedy called "prompt engineering".

To put it succinctly, prompt engineering is nothing but an attempt to reverse-engineer a non-deterministic black box for which all of the parameters below are unknown:

  • training set
  • weights
  • constraints on the model
  • layers between you and the model that transform both your input and the model's output, and that can change at any time
  • availability of compute for your specific query
  • and definitely some more details I haven't thought of

"Prompt engineers" will tell you that specific ways of prompting specific models will produce a "better result"... without any criteria for what a "better result" might signify. Meanwhile, all it takes is for users in the US to wake up: available compute goes down, and every model gets significantly dumber than it was just an hour earlier, regardless of any prompt tricks.

Most claims about prompts have as much evidence behind them as homeopathy. When people actually apply even the tiniest bit of rigorous examination, most claims by prompt "engineers" disappear like the morning dew. For example, prior to the new breed of "thinking" models, chain-of-thought queries were touted as great, amazing, awe-inducing. Sadly, in reality they only improved results for very narrow, hyperspecific queries and had no effect on broader queries, even when the same techniques could be applied to them:

https://arxiv.org/pdf/2405.04776

"Very specific prompts are more likely to work, but they can require significantly more human labor to craft. Our results indicate that chain of thought prompts may only work consistently within a problem class if the problem class is narrow enough and the examples given are specific to that class."
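
For contrast, here is a minimal sketch of what even the tiniest bit of rigorous examination could look like: fix an explicit pass/fail criterion, run each prompt variant many times because the box is non-deterministic, and check whether the difference in success rates survives a permutation test. Everything named here is an assumption for illustration. In particular, fake_model is a hypothetical stand-in for "call the model and check its output against your criterion", and the 0.60 pass probability is made up, not a measurement.

```python
import random


def permutation_p_value(a: list[bool], b: list[bool],
                        rounds: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation test on the difference in success rates.

    Returns the fraction of random relabelings whose difference is at
    least as large as the observed one. A large value means the observed
    "improvement" is indistinguishable from noise.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = a + b
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        if diff >= observed:
            hits += 1
    return hits / rounds


def fake_model(pass_probability: float, rng: random.Random) -> bool:
    # Hypothetical stand-in: a real harness would send the prompt to a
    # model and apply a written-down pass/fail criterion to the output.
    return rng.random() < pass_probability


rng = random.Random(42)
# Two prompt "variants" with the same underlying (made-up) pass rate.
plain_prompt = [fake_model(0.60, rng) for _ in range(200)]
trick_prompt = [fake_model(0.60, rng) for _ in range(200)]

p = permutation_p_value(plain_prompt, trick_prompt)
print(f"plain: {sum(plain_prompt) / 200:.2f}  "
      f"trick: {sum(trick_prompt) / 200:.2f}  p-value: {p:.3f}")
```

Even this says nothing about whether the result holds tomorrow, on another model, or behind a different set of intermediate layers. It only shows that a "better result" claim needs a stated criterion and repeated runs before it means anything.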

Now that the models have progressed to OpenAI o3 and Google Gemini 2 Pro, prompt "engineering" has also progressed to Rules for AI, large context windows, and other snake oil remedies that are as effective and deterministic as the previous ones.

In reality these are just shamanic rituals with outcomes based on faith, fear, or excitement. Engineering it is not.
