Calculate whether you should use AI


Yeah, yeah, I know: people on Hacker News hold countless different opinions about AI coding. Some say it has completely changed programming; others claim it merely wastes people’s time. Although most people are not so binary in their views, I can’t help but wonder why there is such a significant divide over AI coding, a divergence that exists even among my peers in the industry. I’m an AI coding user myself; from the earliest Copilot to the current Claude Code, I have kept up with the times, and my own attitude toward AI coding has shifted more than once along the way. In this article, I would like to take a mathematical approach to answering why people’s experiences with AI coding differ so much, and whether a formula can tell us if a given project should use AI coding.

Modeling

Since we are going to use mathematical methods to address this question, we first need to clarify which variables affect the effectiveness of AI coding. Only then can we model the relationships between these variables to derive a scoring model.

First, the most important factor is, of course, the capability of the AI itself. This includes the capabilities of the model: the size of the context it can handle, its understanding and analytical abilities, and so on. It also includes the capabilities of the AI coding tools: Claude Code, for example, can do much more than answer questions like a chatbot; it can directly participate in project development, debugging, and even deployment. In other words, the AI’s own capability defines the upper and lower bounds of AI coding, determining what it can accomplish.

Second, we naturally consider the complexity of the project itself: its scale, the amount of code, its dependencies, its business logic, and so on. It goes without saying that the lower a project’s complexity, the better AI coding performs; conversely, higher complexity demands more contextual information and deeper understanding, raising the bar on AI capability.

The two factors above are what almost everyone thinks of immediately, but beyond them there are several other important variables to consider. These hidden variables subtly but profoundly influence the final outcome, in weakest-link fashion: the lowest value tends to limit the whole. Using the first two factors as a foundation, we can divide our model into a capability layer and a goal layer, and then keep digging.

Capability Layer

We noted above that the AI’s own capability defines the upper and lower bounds of AI coding. The span between those bounds can be very large, and how much of that capability is actually realized depends on the human developer’s ability to direct and use the AI: whether one can fully exploit the various functions of the AI tools, interact effectively with the model, and properly review the code the AI produces.

Furthermore, we need to consider the ecosystem around the chosen technology. Whether it’s the programming language or the specific tech stack, the maturity and activity of its ecosystem significantly affect AI coding results. For instance, a vibrant ecosystem like shadcn/ui, with rich documentation and even an MCP Server, will likely yield better AI coding results than an obscure frontend library with almost no discussion on Stack Overflow.

Therefore, the capability layer is ultimately determined by three variables:

  • AI Capability: \(A \in [0,1]\)
  • Human developer skill level: \(H \in [0,1]\)
  • Ecosystem maturity of the technology selection: \(L \in [0,1]\)

These three variables do not contribute equally. Suppose we have a very powerful model and an easy-to-use tool; even if the human developer is not highly skilled and only knows some basic debugging, writing something simple like a Python script in a very mature ecosystem won’t turn out badly. Conversely, if the human developer is highly skilled but the model and tools are poor, it’s often better to write the code oneself. In practice, then, the three factors enter as a weighted sum:

\[ Cap = w_A A + w_H H + w_L L\qquad \begin{cases} w_A + w_H + w_L = 1,\\ w_A > w_H > w_L \ge 0,\\ (w_A,w_H,w_L) \approx (0.5,\,0.3,\,0.2) \end{cases} \]

In this way, we can calculate the score of the capability layer \(Cap\) through a formalized method.
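As a concrete illustration, here is a minimal Python sketch of the capability score, assuming the illustrative weights \((0.5, 0.3, 0.2)\) above; the function name `cap` and its signature are my own invention:

```python
def cap(A: float, H: float, L: float,
        w_A: float = 0.5, w_H: float = 0.3, w_L: float = 0.2) -> float:
    """Capability-layer score: a weighted sum of AI capability (A),
    human developer skill (H), and ecosystem maturity (L), all in [0, 1]."""
    assert abs(w_A + w_H + w_L - 1.0) < 1e-9, "weights must sum to 1"
    return w_A * A + w_H * H + w_L * L
```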

Goal Layer

Turning to the goal layer: we have already mentioned goal complexity, but that alone is far from enough. As in the capability layer, sufficient information is crucial; here, though, what we need to provide is not information about technical choices or architecture but a sufficient description of the goal we aim to achieve. In essence, we are feeding information to the AI. Take a front-end project as an example: with the same tech stack, instead of simply telling the AI what kind of interface we want, we could directly provide a design draft, a requirements specification, and an OpenAPI definition; the results would be far better. Some may object that giving the AI too much information sometimes yields worse results. Precisely for this reason, the information feeding discussed here refers to information density, not raw volume.

In addition to complexity and information density, we also need to consider error tolerance. If a project’s tech stack has very strict version requirements and a complex deployment environment, and the failure of one small module would crash the entire project, then the success rate of AI coding drops visibly. Conversely, if the stack has good version compatibility and an ordinary deployment environment, and a failing module only raises a warning without affecting the whole, then the success rate of AI coding will naturally be high, and the portion requiring human involvement shrinks.

So, we also obtain three variables at the goal layer:

  • Goal Complexity: \(C \in [0,1]\)
  • Information Density: \(D \in [0,1]\)
  • Error Tolerance: \(T \in [0,1]\)

For ease of calculation, we convert goal complexity into its complement, ease of completion:

\[ C_{eff} = 1 - C \]

This leads to the calculation formula for the goal layer:

\[ Goal = w_C (1-C) + w_D D + w_T T\qquad \begin{cases} w_C + w_D + w_T = 1,\\ w_C > w_D > w_T \ge 0,\\ (w_C,w_D,w_T) \approx (0.5,\,0.3,\,0.2) \end{cases} \]
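A matching sketch for the goal layer, again with the illustrative default weights (the helper name `goal` is likewise hypothetical):

```python
def goal(C: float, D: float, T: float,
         w_C: float = 0.5, w_D: float = 0.3, w_T: float = 0.2) -> float:
    """Goal-layer score: complexity C enters as ease of completion (1 - C);
    D is information density, T is error tolerance, all in [0, 1]."""
    assert abs(w_C + w_D + w_T - 1.0) < 1e-9, "weights must sum to 1"
    return w_C * (1 - C) + w_D * D + w_T * T
```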

Formula

With the capability layer and the goal layer in place, we can easily obtain the final scoring formula:

\[ E = Cap \cdot Goal = (w_A A + w_H H + w_L L) \cdot (w_C (1-C) + w_D D + w_T T) \]

When all conditions are ideal, that is, the AI is omnipotent, the human developer is all-knowing, the chosen stack’s ecosystem is fully mature, the goal is trivially simple, the information feeding is perfectly dense, and the error tolerance is maximally lenient, then:

\[ E_{max} = Cap_{max} \cdot Goal_{max} = 1 \cdot 1 = 1 \]

And when all conditions are at their worst, that is, the AI is useless, the human developer knows nothing, the ecosystem offers nothing, the goal is extremely complex, the information feeding is extremely poor, and the error tolerance is extremely strict, then:

\[ E_{min} = Cap_{min} \cdot Goal_{min} = 0 \cdot 0 = 0 \]

In other words, the score \(E\) ranges over \([0,1]\): the closer it is to 1, the more suitable the project is for AI coding; the closer it is to 0, the less suitable.
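Composing the two sketches above gives the whole model as one hypothetical function; because each factor is a convex combination of values in \([0,1]\), their product necessarily stays in \([0,1]\):

```python
def score(A: float, H: float, L: float,
          C: float, D: float, T: float) -> float:
    """Overall suitability E = Cap * Goal; always lies in [0, 1]."""
    return cap(A, H, L) * goal(C, D, T)
```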

Examples

Let’s take a few examples.

Python Script

Suppose a seasoned Linux user suddenly has the idea to write a simple Python script using the latest AI coding models and tools. We can consider:

  • AI capability: \(A = 0.9\)
  • Human developer skill level: \(H = 0.9\)
  • Ecosystem maturity: \(L = 0.9\)

For the goal layer:

  • Goal complexity: \(C = 0.1\)
  • Information density: \(D = 0.9\)
  • Error tolerance: \(T = 0.9\)

Then we can obtain:

\[ Cap = 0.5 \times 0.9 + 0.3 \times 0.9 + 0.2 \times 0.9 = 0.9 \]
\[ Goal = 0.5 \times (1-0.1) + 0.3 \times 0.9 + 0.2 \times 0.9 = 0.9 \]
\[ E = 0.9 \times 0.9 = 0.81 \]
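As a quick sanity check, the `score` sketch from earlier reproduces this number:

```python
# Seasoned Linux user, simple Python script, mature ecosystem
E = score(A=0.9, H=0.9, L=0.9, C=0.1, D=0.9, T=0.9)
print(f"{E:.2f}")  # 0.81
```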

The answer is obvious. But what about a novice developer?

  • AI capability: \(A = 0.9\)
  • Human developer skill level: \(H = 0.3\)
  • Ecosystem maturity: \(L = 0.9\)
  • Goal complexity: \(C = 0.1\)
  • Information density: \(D = 0.9\)
  • Error tolerance: \(T = 0.9\)

Then we can obtain:

\[ Cap = 0.5 \times 0.9 + 0.3 \times 0.3 + 0.2 \times 0.9 = 0.72 \]
\[ Goal = 0.5 \times (1-0.1) + 0.3 \times 0.9 + 0.2 \times 0.9 = 0.9 \]
\[ E = 0.72 \times 0.9 \approx 0.65 \]

Although the score has declined, it remains above average, indicating that AI coding can still handle a basic Python script comfortably. What about a complete non-developer?

  • AI capability: \(A = 0.9\)
  • Human developer skill level: \(H = 0.1\)
  • Ecosystem maturity: \(L = 0.9\)
  • Goal complexity: \(C = 0.1\)
  • Information density: \(D = 0.9\)
  • Error tolerance: \(T = 0.9\)

Then we can obtain:

\[ Cap = 0.5 \times 0.9 + 0.3 \times 0.1 + 0.2 \times 0.9 = 0.66 \]
\[ Goal = 0.5 \times (1-0.1) + 0.3 \times 0.9 + 0.2 \times 0.9 = 0.9 \]
\[ E = 0.66 \times 0.9 \approx 0.59 \]

The score continues to decline but remains slightly above average; presumably this is the world-changing AI coding touted in some YouTube videos, lol.

Web Panel

Suppose a mid-level front-end developer plans to use AI coding to build an enterprise-grade Web panel, given that the back-end interfaces are basically complete. We can consider:

  • AI capability: \(A = 0.9\)
  • Human developer level: \(H = 0.5\)
  • Ecosystem maturity of selection: \(L = 0.8\)

For the goal layer:

  • Goal complexity: \(C = 0.5\)
  • Information density: \(D = 0.7\)
  • Error tolerance: \(T = 0.7\)

Then we can derive:

\[ Cap = 0.5 \times 0.9 + 0.3 \times 0.5 + 0.2 \times 0.8 = 0.76 \]
\[ Goal = 0.5 \times (1-0.5) + 0.3 \times 0.7 + 0.2 \times 0.7 = 0.6 \]
\[ E = 0.76 \times 0.6 \approx 0.46 \]

Although this score is far below the previous example, it sits in the middle range: AI can be utilized, but the developer’s involvement is still necessary; the work can’t be handed over to AI entirely. What happens if the same system is built by a novice developer who chose an obscure front-end framework?

  • AI capability: \(A = 0.9\)
  • Human developer level: \(H = 0.2\)
  • Ecosystem maturity of selection: \(L = 0.3\)
  • Goal complexity: \(C = 0.5\)
  • Information density: \(D = 0.5\)
  • Error tolerance: \(T = 0.5\)

Then we can derive:

\[ Cap = 0.5 \times 0.9 + 0.3 \times 0.2 + 0.2 \times 0.3 = 0.57 \]
\[ Goal = 0.5 \times (1-0.5) + 0.3 \times 0.5 + 0.2 \times 0.5 = 0.5 \]
\[ E = 0.57 \times 0.5 \approx 0.29 \]

Due to the novice developer’s lower skill and the immature ecosystem choice, with AI capability unchanged, the score drops significantly. This further confirms the observation that for the same project goal, even with exactly the same AI models and tools, outcomes vary greatly with the individual’s skill level.

I can’t help but wonder: what if a senior developer used a mature front-end framework but didn’t feed the AI enough information, providing no back-end interface documentation, only vague requirement descriptions, and no design drafts?

  • AI capability: \(A = 0.9\)
  • Human developer level: \(H = 0.9\)
  • Ecosystem maturity of selection: \(L = 0.8\)
  • Goal complexity: \(C = 0.5\)
  • Information density: \(D = 0.2\)
  • Error tolerance: \(T = 0.7\)

Then we can derive:

\[ Cap = 0.5 \times 0.9 + 0.3 \times 0.9 + 0.2 \times 0.8 = 0.88 \]
\[ Goal = 0.5 \times (1-0.5) + 0.3 \times 0.2 + 0.2 \times 0.7 = 0.45 \]
\[ E = 0.88 \times 0.45 = 0.396 \]

Interestingly, this result is even lower than that of the mid-level developer who provided sufficient information. What conclusion does that suggest? I’ll leave it for you to ponder.

MCP

Now suppose a developer chooses a technology stack so new that its community ecosystem is not yet very active. Based on our model, we can estimate:

  • AI capability: \(A = 0.9\)
  • Human developer level: \(H = 0.7\)
  • Ecosystem maturity of selection: \(L = 0.2\)
  • Goal complexity: \(C = 0.5\)
  • Information density: \(D = 0.8\)
  • Error tolerance: \(T = 0.8\)

\[ Cap = 0.5 \times 0.9 + 0.3 \times 0.7 + 0.2 \times 0.2 = 0.7 \]
\[ Goal = 0.5 \times (1-0.5) + 0.3 \times 0.8 + 0.2 \times 0.8 = 0.65 \]
\[ E = 0.7 \times 0.65 \approx 0.46 \]

At this score, the developer will need to make an active effort and collaborate with the AI. However, they quickly discover that this technology stack provides an MCP Server, which immediately eases the ecosystem issue:

  • AI capability: \(A = 0.9\)
  • Human developer level: \(H = 0.7\)
  • Ecosystem maturity of selection: \(L = 0.8\)
  • Goal complexity: \(C = 0.5\)
  • Information density: \(D = 0.8\)
  • Error tolerance: \(T = 0.8\)

\[ Cap = 0.5 \times 0.9 + 0.3 \times 0.7 + 0.2 \times 0.8 = 0.82 \]
\[ Goal = 0.5 \times (1-0.5) + 0.3 \times 0.8 + 0.2 \times 0.8 = 0.65 \]
\[ E = 0.82 \times 0.65 \approx 0.53 \]

This reaches a solidly mid-level score and eases the developer’s workload somewhat, illustrating the help that introducing MCP brings: not revolutionary, but certainly not to be overlooked.

In fact, there are many other interesting examples that I won’t elaborate on here. Through mathematical modeling we can quantify the applicability of AI coding, and with a formalized scoring model we can measure more accurately whether a project should adopt AI tools, and what to pay attention to in order to maximize their effectiveness once adopted.

Generalization

I believe this formula applies not only to AI coding but also to AI in other fields, such as content creation, data analysis, and even scientific research. Based on the capability layer and goal layer above, we can generalize the variables as:

  • AI Capability: \(A\)
  • Human Capability: \(H\)
  • Domain Ecology: \(L\)
  • Task Complexity: \(C\)
  • Information Density: \(D\)
  • Error Tolerance: \(T\)

Thus, we derive a universal scoring model:

\[ E = (w_A A + w_H H + w_L L) \cdot (w_C (1-C) + w_D D + w_T T) \]

Using food content creation and CAD design in engineering as a comparison:

  • Food Content Creation

\[ E = (0.5 \times 0.9 + 0.3 \times 0.3 + 0.2 \times 0.9) \cdot (0.5 \times (1-0.2) + 0.3 \times 0.8 + 0.2 \times 0.9) = 0.72 \times 0.82 \approx 0.59 \]

  • Engineering CAD Design

\[ E = (0.5 \times 0.9 + 0.3 \times 0.7 + 0.2 \times 0.2) \cdot (0.5 \times (1-0.9) + 0.3 \times 0.6 + 0.2 \times 0.1) = 0.7 \times 0.25 \approx 0.18 \]
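For reference, the same `score` sketch reproduces both comparisons with the inputs assumed above:

```python
# Food content creation: simple task, mature ecosystem, modest human skill
print(f"{score(A=0.9, H=0.3, L=0.9, C=0.2, D=0.8, T=0.9):.2f}")  # ~0.59
# Engineering CAD design: very complex task, immature ecosystem
print(f"{score(A=0.9, H=0.7, L=0.2, C=0.9, D=0.6, T=0.1):.2f}")  # ~0.18
```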

With the same AI models and tools, even when the human is more capable, what AI can do in an unfamiliar ecosystem on a highly complex task is very limited. For simple tasks in mature ecosystems, however, even a human with weak skills or no experience can get good results with AI. This is also why, despite today’s flourishing AI capabilities, we still don’t see AI widely used as a productivity tool in traditional industries. I believe that as those industries advance in their digital transformation, AI will be applied far more widely as a productivity tool. Regardless, we humans will always be indispensable: even with a perfect AI, a fully mature ecosystem, and an ideal goal, zero human capability caps the score at 0.7:

\[ E = (0.5 \times 1 + 0.3 \times 0 + 0.2 \times 1) \cdot (0.5 \times (1-0) + 0.3 \times 1 + 0.2 \times 1) = 0.7 \times 1 = 0.7 \]
