Pokemon NPCs Powered by Local LLMs

0 reads

Nurse Joy Mogged

I ROM-hacked Pokemon Emerald so that every line of dialogue is generated on the fly by a local LLM. It's a scrappy proof of concept, but it works, and getting it working was surprisingly easy. Here's how I built it and why I think this is a glimpse of where gaming is headed.

Vibe ROM Hacking

This was my first foray into ROM hacking. I've always been interested in the idea of modding or "hacking" the games I grew up with, but coding agents have finally lowered the barrier to entry for me to try it out in my free time.

I found Pokemon to be a great starting point because it's simple, nostalgic, turn-based and text-heavy, with discrete game state that's easy to reason about. Plus, the ROMs themselves are very small, typically around 20MB. The mGBA emulator also has some great features that make it very easy to experiment with GBA ROMs on your PC.

A ROM is the chip on a GBA cartridge that holds the bytes of the game's contents. ROMs can be extracted from GBA cartridges and transformed into a .gba file, which can then be reverse engineered and modified or "hacked" to change the internals of how the game works. Modified ROMs can be loaded back onto GBA cartridges or played via an emulator.

ROM Cartridge Diagram

To my surprise, most of the original Pokemon games have been fully reverse engineered and decompiled from the ROM binaries into C source code. This means today, you can just... git clone Pokemon Emerald and build the ROM from source. Mods or ROM hacks become source code changes. I found the Emerald decompiled source code here alongside many other classic Pokemon games: https://pret.github.io/.

Standing on the shoulders of giants, and armed with your coding agent of choice, you can "vibe ROM hack" very easily today. It took me about ten minutes to get the source code building and to figure out how to make basic changes like modifying player status. From there I decided to take a swing at something more ambitious.

Generative NPC Dialogue

Since GBA games have such low resource requirements, if you are running them on an emulator on your PC, your GPU is most likely just sitting there idle. This gave me the idea to run a local LLM and integrate it into the game dialogue system.

The mGBA emulator has a built-in scripting system where you can write and execute Lua scripts that can act as a pipe to the game memory. This means you can read and modify the game memory in real time.

On the ROM side, I found that every NPC dialogue block in the game passes through the same function. I basically turned that function into a mailbox. The ROM drops off the original text and raises a flag. The Lua script picks it up, drops in the LLM rewrite, and lowers the flag.

For the local LLM, I used a 4-bit quantization of Google Gemma 4 E4B-it, which only requires ~6GB of VRAM. I found that this worked well for my use case, but you could certainly swap this out for a more powerful local model, or even a frontier model via API.

At the prompting layer, I kept things simple and just injected the same system prompt into every dialogue block. This prompt effectively passes the original content through a thematic filter. The cover image, for example, is from a version of Pokemon Emerald where all the characters are obsessed with Gen Z slang.

I committed the source code changes to a branch here: https://github.com/owenmccadden/pokeemerald/tree/local-llm-powered-npcs.

Where This Could Go

While this is a fun proof of concept, it's not very usable in its current form. The biggest pain point is the latency waiting for the LLM. It takes around 2-3 seconds for dialogue to show up after interacting with an NPC. This gets old fast, especially when using the fast forward toggle available in the mGBA emulator.

If I were to actually ship this, I would probably pre-generate all the dialogue ahead of time and bake it into the ROM. That would solve the latency problem, but the tradeoff is that the world remains more static and deterministic.

Some other avenues to consider include:

  • Allowing the LLM to use tools mapped to NPC actions such as movement, interacting with items, offering to trade Pokemon, etc.
  • Applying different prompts to different characters, factions, and scenarios rather than using a monolithic system prompt across the whole game
  • Giving NPCs persistent memory so that future interactions build upon previous ones

These ideas are focused on generative NPCs, but they hint at a broader trend in gaming. RPGs have always offered a finite level of freedom within a fixed world. You can go anywhere, talk to anyone, make your own choices, but the underlying world is deterministic. With AI models in the loop, that constraint is starting to loosen. Text-based models provide new possibilities, but world models take that to another level. These technologies point toward a new generation of RPGs where the world is truly dynamic and shaped by the player, one interaction at a time.