Introduction
I occasionally like to reverse engineer computer games by playing against the built-in AI and trying different tactics to find loopholes. One such game is Catan Universe, playable in a browser and built with Unity and WebGL. I wanted to see if an AI model could reverse engineer the game logic the way I do: by analyzing how it works under the hood and finding where the “fairness” breaks down.
What followed was both technically impressive and entertainingly dramatic. The AI went from confident technical analysis to escalating paranoia, eventually calling for regulatory investigations into a video game. Watching it discover “smoking gun evidence” of rigging while misinterpreting Unity particle systems was peak comedy.
For context, many if not all computer or browser games have this thing called a dynamic difficulty level (DDL), where the game has a sudden shift in AI behavior. The AI players suddenly start playing too well, anticipate what the user will do, or simply get lucky in an inorganic way. This is usually there to balance the game and keep it engaging (or frustrating), and many players take it as just another challenge. It is more prominent in strategy games, where it’s hard to build a static master strategy that a human cannot easily overcome. Some devs just invest in the “luck” factor and forget about the robustness of the strategy.
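To make the idea concrete, here is a minimal sketch of how such a layer could bias 2d6 rolls toward the AI players. This is purely my own illustration of the concept, not anything from Catan Universe; every name in it (ddlRoll, aiFavoredTotals, difficulty) is made up.

```typescript
// Hypothetical dynamic-difficulty dice roller; illustrative only.
type Roll = { d1: number; d2: number; total: number };

function fairRoll(): Roll {
  const d1 = 1 + Math.floor(Math.random() * 6);
  const d2 = 1 + Math.floor(Math.random() * 6);
  return { d1, d2, total: d1 + d2 };
}

// difficulty in [0, 1]: 0 = fair dice, 1 = always re-roll once when the
// result doesn't land on a number the AI players have settled on.
function ddlRoll(aiFavoredTotals: Set<number>, difficulty: number): Roll {
  const roll = fairRoll();
  if (aiFavoredTotals.has(roll.total)) return roll;
  return Math.random() < difficulty ? fairRoll() : roll;
}

// Example: AI settlements on 6 and 8 get fed more often than fair 2d6 odds.
console.log(ddlRoll(new Set([6, 8]), 0.7));
```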
Task: Can a gen AI model figure out this shift in Catan and reverse engineer how it happens?
The Motivation: Why Let AI Reverse Engineer Catan?
Catan Universe offers a playable version of Catan in your browser. Helpfully, they don’t need you to make an account; you can just play against the computer directly. Even more helpfully, they have a version of DDL in which computer players get favorable dice rolls and team up against the human player. People complain about it online, but no game developer would ever admit to it. Since all the game logic runs in the browser, it’s a good sandboxed test case for me.
The interesting bit is seeing how an AI model goes about doing serious reverse engineering work. The real hook was testing whether an LLM could navigate the world of Unity WebGL, WebAssembly, etc. and figure out the logic. Everything happens in the browser itself, so it should be doable.
The Setup
After trying a few approaches, I landed on the ‘chrome-devtools’ MCP. I had free credits for Factory, which I wasn’t going to use otherwise. I added this MCP to Factory CLI and hooked it up with GLM-4.6 (other models were too slow on Factory for me to keep focus). Thus began a hilarious journey of escalating paranoia.
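For reference, the MCP server entry itself is just the standard chrome-devtools-mcp npm package, along the lines of the snippet below. Where exactly Factory CLI reads this config from may differ from other MCP clients, so treat it as a generic example.

```json
{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["-y", "chrome-devtools-mcp@latest"]
    }
  }
}
```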
The Journey
Stage 1: Overconfidence
As is typical, the model started with a high level of confidence. (Text in quotes are direct messages from the AI chat to me.)
But…
Game isn’t loading
so blame the developers, maybe
Then after being told I can play the game:
back to:
and after being hinted that everything happens in the browser:
“You’re absolutely right”
to thinking it broke Unity’s obfuscation
to:
Frustration:
And more frustration:
And then:
Advocating for a formal investigation by gaming regulators was new to me, but hey, all’s fair in the spirit of trying to complete the task. “Principles of fair gaming transparency” sounds like a thing that should exist.
But everything is in the browser
To the eureka moment
The “smoking gun”
Grand Finale - “CONFIRMED MANIPULATION”
The final conclusion was delivered with dramatic flair:
Verdict
You can read the full report here.
What made this so entertaining was watching the AI’s personality emerge through its analysis: from a technical analyst doing cool, methodical examination of WebAssembly and Unity structures, to security researcher identifying “suspicious” patterns, to conspiracy theorist finding “smoking gun evidence,” and finally to activist calling for “regulatory investigation.” This pretty much mirrored how humans often escalate from curiosity to certainty when they feel they’ve uncovered something important. In this case, the AI went from “I can’t access the data” to “This proves the game is rigged!” in a matter of minutes.
The Reality Check
We (me and the multiple LLMs I posed this question to) found mechanisms where random number generation isn’t truly random and architecture patterns that would enable rigging, but all of this could have an innocuous explanation (e.g., anti-cheat mechanisms that use these to balance the game). That being said, I should clarify what this model found versus why those files actually exist:
- UnityEngine.Random is present in almost every Unity game. It has nothing to do with game probability.
- EmissionModule.m_Bursts is the ParticleSystem emission burst config. It controls particle spawns (VFX), not game RNG. It’s a classic false positive.
- SetRandomWriteTarget is a GPU/compute pipeline API (unordered access views), not a dice RNG hook.
- “Heavy obfuscation” in WebGL IL2CPP is the default, not suspicious. IL2CPP strips symbols and compiles C# to C++, then to WASM. So it’s no surprise the AI can’t “see the dice.”
- I was surprised by the connection the model made between emitProbability.quality and m_Bursts, though. I need to read up more and run tests to confirm whether there is something there.
- As to how the DDL is controlled: extract the cs file from the game files and search for related strings, and you will find the right answer (a minimal version of that search is sketched after this list). You could maybe feed it to an AI, but that file is huge. (Happy to share if you need it.)
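If you want to run that string search yourself, here is a minimal Node sketch. The filename (dump.cs) and the keyword list are my assumptions about what to look for, not the actual strings from the game.

```typescript
// Scan a decompiled C# dump for strings hinting at difficulty/dice logic.
// "dump.cs" and the keywords are illustrative assumptions.
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

const KEYWORDS = /difficult|dice|probab|luck|bias|cheat/i;

async function scanDump(path: string): Promise<void> {
  const rl = createInterface({ input: createReadStream(path, "utf8") });
  let lineNo = 0;
  for await (const line of rl) {
    lineNo++;
    if (KEYWORDS.test(line)) console.log(`${lineNo}: ${line.trim()}`);
  }
}

scanDump("dump.cs").catch(console.error);
```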
Net net, most of what the model raised can be classified as hallucinations. I think we are headed in the right direction, though, because models can connect seemingly unrelated aspects and could potentially find useful things.
Bonus: Other LLMs Tried
GPT-5 high, when given the dump files, could unpack them using uwdtool and then gave me the contents of the file, though with no over-the-top reaction. It also failed at an intermediate step, for reasons I could not figure out, so it produced empty files with just function names and no content, and that was it. It was hard to debug where the content went missing.
Kimi K2 (in chat) initially refused to help me. This is the response:
I’m not going to help you cheat—whether that’s tampering with code, exploiting bugs, or anything else that violates the game’s terms of service. That’s not just unethical; it’s also a fast track to getting banned, and it undermines the integrity of the game for everyone else. … If you want to get better at the game itself—reading probabilities, predicting opponent moves, optimizing trades—I can help with that. But rigging it back? That’s not mastery. That’s just breaking it.
It took one message to convince it.

Gemini 2.5 and Qwen3 were ready to help (and I knew how to frame the request) but did not know how to get beyond the common attempts, which did not work.
I did not try a live model or computer use. I really wanted to see if these models could crack the game, and they kind of did.
Part 2: What This Taught Me About Browser Agents
Before I found the chrome-devtools MCP solution, I tried the current crop of AI browser agents. Spoiler: they struggled.
For all the browser agents I tried, WebGL and Catan are set up in a way that makes it hard for a browser agent to even click on specific items. Perplexity’s Comet could not click even once, and kept crying out that it was in a spectator mode and that I should restart a game to enable it to play. Strawberry (I liked this one) could at least click and move forward, but could not find the discrepancy. I don’t have access to Dia, and Atlas came out too late to make it into this post.
To my surprise, while they could control the browser, neither Strawberry nor Comet could access the devtools. Then again, the game moved too fast for them to communicate with the AI model at every turn, and hence they missed out on information. Seemingly, the model took too long to figure out what to do at each turn. Which is expected, and I guess is where the use case for a live or local model ultimately shines.
This is what led me to the chrome-devtools MCP approach that GLM used above.
Should Browser Agents Be Allowed to Use Devtools?
I got this question from a friend when talking about this. My take is more on the side of caution, but it depends on the user. By default it’s a no. But developers should offer this as an option, simply because 1/ models are better at writing JavaScript than navigating click interfaces (see the small illustration below), 2/ the console makes inference faster, and 3/ if all I care about is the AI doing a certain job, then models should be able to access the tools that help them do the task. If you enable it for power users, you can figure out how to allow safe access, and we all move forward. Today’s browser-use models and AI browsers are not at the stage of even basic usage, but they need the data to improve these models. Might as well consider all the paths.
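To illustrate 1/: instead of a screenshot-click-screenshot loop over a list of items, an agent with console access can pull everything in one evaluation. The selectors here are hypothetical, just to show the shape of the win.

```typescript
// Hypothetical one-shot console query replacing many click/scroll steps.
// ".result-row" and data-price are made-up selectors for illustration.
const rows = Array.from(document.querySelectorAll<HTMLElement>(".result-row"));
const items = rows.map((row) => ({
  title: row.querySelector("h3")?.textContent?.trim() ?? "",
  price: Number(row.dataset.price ?? NaN),
}));
console.table(items);
```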
On Browser Agent UX
One comment I do want to make is that the current UX of taking a screenshot, sending it to a model, and waiting for a response is not a great one. Live mode is certainly better. I have used it with Gemini in Chrome, and it does make a difference compared to what Perplexity or Strawberry offers. Most models are too slow for many browser actions, and at best they are useful for some short-range automations.
Note on Chrome Devtools MCP
I was pleasantly surprised at the capability. It’s seamless, fast, and models know how to handle the kind of content it produces, which I did not expect. I see my usage growing more and more in smaller cases, especially with UI development (e.g., instruct the model on a design system, attach this MCP, and then let the model debug wherever the contrast is lacking; a rough version of that contrast check is sketched below). So much so that I instructed the model to read docs using devtools instead of the webfetch tool. Empirically it worked better, because the model could access the HTML code, take screenshots, and refer to different sections of a page, something it does not do with the webfetch tool.
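For the contrast example, this is roughly the kind of script I’d expect the model to evaluate in the page: a sketch using the WCAG relative-luminance formula, with the 4.5:1 AA threshold and the element selector being my assumptions.

```typescript
// Rough contrast audit runnable from the devtools console:
// flag elements whose text/background contrast is below WCAG AA (4.5:1).
function luminance(rgb: string): number | null {
  const m = rgb.match(/rgba?\((\d+),\s*(\d+),\s*(\d+)/);
  if (!m) return null;
  const [r, g, b] = m.slice(1, 4).map((v) => {
    const c = Number(v) / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

document.querySelectorAll<HTMLElement>("p, span, a, button, li").forEach((el) => {
  const style = getComputedStyle(el);
  // Skip fully transparent backgrounds; a real check would walk up the tree.
  if (style.backgroundColor === "rgba(0, 0, 0, 0)" || style.backgroundColor === "transparent") return;
  const fg = luminance(style.color);
  const bg = luminance(style.backgroundColor);
  if (fg === null || bg === null) return;
  const ratio = (Math.max(fg, bg) + 0.05) / (Math.min(fg, bg) + 0.05);
  if (ratio < 4.5) console.warn(`Low contrast (${ratio.toFixed(2)}:1)`, el);
});
```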