1 · Reversing the binary
The problem: all we have is a stripped, optimized 4 MB executable. You can't port a game you can't read — and you can't rebuild what you can't reassemble.
The starting point is a single file: the original Clash of Clans program, compiled in 2012 for 32-bit ARM. There's no source. Tools like IDA Pro can show us a readable, pseudo-C view of what each function does, which is enough for a human to understand the code — but that pseudo-C doesn't compile, and it can't be trusted byte-for-byte. To actually rebuild the game we needed something stronger.
Turning the binary back into source we can rebuild
The key move is to regenerate the program as assembly that reassembles into the
exact same bytes as the original — and to split it back out into one file per
original source unit, which the binary's leftover debug symbols still describe.
These are .S files (the .S extension just means "assembly, run it through the
C preprocessor first").
That sounds abstract until you see one. Here is a real example — a tiny method,
Connection::setListener, exactly as the tool emits it:
```asm
.globl __ZN10Connection11setListenerEP19IConnectionListener
.weak_def_can_be_hidden __ZN10Connection11setListenerEP19IConnectionListener
.thumb_func
__ZN10Connection11setListenerEP19IConnectionListener:
.short 0x6081 @ str r1, [r0, #8] ; store the listener pointer
.short 0x4770 @ bx lr ; return
```
Three things are happening here, and they're the whole point of this step:
- The raw bytes are emitted verbatim. .short 0x6081 literally writes the two bytes of the original str instruction. Assemble this file and you get the same machine code back, not a paraphrase of it. If our rebuilt game ever behaves differently from the real one, the difference is something we changed, not a transcription slip.
- It's readable. The comment on each line is the disassembled instruction, and the mangled label (__ZN10Connection…) decodes to Connection::setListener(IConnectionListener*). We know what we're looking at.
- It's marked weak (.weak_def_can_be_hidden). That single directive is what makes incremental reverse engineering possible; see below.
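To make the first two points concrete, here is a minimal C++ sketch (not part of the project's tooling) that reproduces what the listing's comments claim: the bytes a .short directive lays down on little-endian ARM, and the instruction each halfword decodes to. The names emitShort and decodeThumb16 are made up for illustration, and the decoder deliberately handles only the two Thumb encodings that appear in the listing.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Bytes that `.short value` places in a little-endian object file:
// low byte first, then high byte.
std::array<std::uint8_t, 2> emitShort(std::uint16_t value) {
    return {static_cast<std::uint8_t>(value & 0xFF),
            static_cast<std::uint8_t>(value >> 8)};
}

// Illustrative decoder for just two 16-bit Thumb encodings.
std::string decodeThumb16(std::uint16_t insn) {
    // STR Rt, [Rn, #imm5*4]: top five bits are 01100 (0x0C).
    if ((insn >> 11) == 0x0C) {
        const unsigned imm = ((insn >> 6) & 0x1F) * 4;  // word-scaled offset
        const unsigned rn = (insn >> 3) & 0x7;          // base register
        const unsigned rt = insn & 0x7;                 // register being stored
        return "str r" + std::to_string(rt) + ", [r" + std::to_string(rn) +
               ", #" + std::to_string(imm) + "]";
    }
    // BX Rm: bits 15..7 are 010001110, Rm sits in bits 6..3, low bits are 000.
    if ((insn & 0xFF87) == 0x4700) {
        const unsigned rm = (insn >> 3) & 0xF;
        if (rm == 14) return std::string("bx lr");
        return "bx r" + std::to_string(rm);
    }
    return std::string("unknown");
}
```

Calling decodeThumb16(0x6081) yields "str r1, [r0, #8]" and decodeThumb16(0x4770) yields "bx lr", matching the comments in the listing; emitShort(0x6081) yields the bytes 0x81, 0x60 in that order.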
Alongside each .S, the tool also emits a .h header recovering the class and
its methods, so the rest of the codebase can call into them by name.
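The mangled label in the listing can be decoded mechanically, which is one plausible way a tool could recover method names for such a header. The sketch below is an illustration under assumptions rather than the project's actual code: it relies on abi::__cxa_demangle from <cxxabi.h>, which is available with GCC and Clang, and the helper name demangleDarwinSymbol is hypothetical. Mach-O symbols carry one extra leading underscore, so it is dropped before demangling.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangles a Mach-O style C++ symbol. Returns the input unchanged
// when the name cannot be demangled.
std::string demangleDarwinSymbol(const std::string& symbol) {
    // "__Z..." on Mach-O corresponds to "_Z..." in the Itanium C++ ABI.
    const std::string itanium =
        (symbol.rfind("__Z", 0) == 0) ? symbol.substr(1) : symbol;

    int status = 0;
    char* demangled =
        abi::__cxa_demangle(itanium.c_str(), nullptr, nullptr, &status);
    if (status != 0 || demangled == nullptr) {
        return symbol;
    }
    std::string result(demangled);
    std::free(demangled);  // __cxa_demangle allocates with malloc
    return result;
}
```

Given __ZN10Connection11setListenerEP19IConnectionListener, this returns Connection::setListener(IConnectionListener*).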
Porting the game logic, one function at a time
Because every function is weak, we can swap any one of them for a hand-written C++ version just by linking a normal (strong) definition of the same name — the linker prefers ours, and everything we haven't touched keeps running the verified original bytes.
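As an illustration, a hand-written replacement for the Connection::setListener example could look roughly like the sketch below. The class layout is an assumption: the original instruction stores the listener at byte offset 8 on 32-bit ARM, but what occupies the first 8 bytes is unknown here, so the two placeholder fields and the listener() accessor are hypothetical. In the real project the declaration would come from the recovered .h header.

```cpp
// Forward declaration only; the real interface lives in the recovered headers.
class IConnectionListener;

class Connection {
public:
    void setListener(IConnectionListener* listener);

    // Hypothetical accessor, added only so the sketch can be checked.
    IConnectionListener* listener() const { return m_listener; }

private:
    // Placeholders for whatever precedes the listener in the original object.
    // Two pointers put m_listener at offset 8 only on a 32-bit target.
    void* m_unknown0 = nullptr;
    void* m_unknown1 = nullptr;
    IConnectionListener* m_listener = nullptr;
};

// An ordinary (strong) definition. It mangles to the same symbol as the weak
// assembly label, so the linker picks this one over the recovered bytes.
void Connection::setListener(IConnectionListener* listener) {
    m_listener = listener;  // same effect as: str r1, [r0, #8]
}
```

Removing this definition from the build would leave the weak assembly version as the only one, so the game falls back to the original bytes without any other change.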
We didn't use this to rewrite the whole program. We only ported the game logic — the deterministic simulation of villages and battles — because that's the one part the project had to have in clean, readable form: it's what the authoritative server is built from. Everything else (rendering, UI, the iOS glue) was left exactly as recovered and later translated automatically by the recompiler.
Porting that logic was a steady, rigorously checked loop. We rewrote one function in C++, then replayed a thousand recorded battles through both the original and the new version and required the outcomes to match exactly — same units, same damage, same final state. Only a function that passed was accepted; then we moved to the next.
```mermaid
flowchart LR
A["Game running on<br/>100% original bytes"] --> B["Rewrite ONE logic<br/>function in C++"]
B --> C{"Replay 1,000 battles.<br/>Identical result?"}
C -->|yes| D["Accept it.<br/>Next function."]
C -->|no| E["Reject & revisit<br/>the original."]
D --> B
E --> B
```
Because everything else was still the trusted original, any divergence pointed squarely at the one function we'd just changed. That's how thousands of methods of game logic were ported without ever silently altering behavior — and that clean C++ is exactly what feeds the server in chapter 4.
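The accept-or-reject check in that loop can be sketched as a small differential harness. Everything named here is a hypothetical stand-in: BattleState, Replay, and Simulator are simplified placeholders for the game's real types, and firstDivergence is not the project's actual test runner. The idea is only to show the shape of the comparison: run each recorded battle through both builds and demand identical final state.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Simplified placeholder for the final state of one simulated battle.
struct BattleState {
    std::vector<std::int32_t> unitHealth;
    std::int64_t totalDamage = 0;

    bool operator==(const BattleState& other) const {
        return unitHealth == other.unitHealth &&
               totalDamage == other.totalDamage;
    }
};

using Replay = std::vector<std::uint32_t>;  // recorded inputs of one battle
using Simulator = std::function<BattleState(const Replay&)>;

// Runs every replay through both simulators and returns the index of the
// first replay whose final states differ, or -1 if all of them match exactly.
std::ptrdiff_t firstDivergence(const Simulator& original,
                               const Simulator& ported,
                               const std::vector<Replay>& replays) {
    for (std::size_t i = 0; i < replays.size(); ++i) {
        if (!(original(replays[i]) == ported(replays[i]))) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}
```

A ported function would be accepted only when firstDivergence returns -1 across the whole replay set; any other value names a specific battle to reproduce and debug.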
A note on Objective-C
The game is mostly C++, but its glue to iOS — the app delegate, view controllers, device sensors — is Objective-C, and those classes only work if they're registered with the Objective-C runtime. For each one the tool emits a small stub class whose methods are one-line trampolines into the reassembled code, so the runtime sees genuine Objective-C classes while the real logic still runs the original instructions.
What this buys us
At the end of this step the whole game exists as source we can rebuild — a byte-identical copy of the original, with every function individually replaceable. From here the project splits in two: the bulk of the program is handed to the recompiler, which translates it to run natively, while the hand-ported game logic flows into the server as the basis for an authoritative, deterministic backend.
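The byte-identical claim is easy to check mechanically. As a minimal sketch, assuming the original and rebuilt code have already been loaded into byte buffers (the loading step and the function name firstMismatch are not from the project), a comparison could report the first offset at which the two differ:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the offset of the first differing byte, or -1 when the two buffers
// are identical in both content and length.
std::ptrdiff_t firstMismatch(const std::vector<std::uint8_t>& original,
                             const std::vector<std::uint8_t>& rebuilt) {
    const std::size_t common = std::min(original.size(), rebuilt.size());
    for (std::size_t i = 0; i < common; ++i) {
        if (original[i] != rebuilt[i]) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    // Equal up to the shorter length: a size difference is still a mismatch.
    if (original.size() != rebuilt.size()) {
        return static_cast<std::ptrdiff_t>(common);
    }
    return -1;
}
```

Reporting an offset rather than a plain yes or no makes it possible to map a mismatch back to the function that contains it.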
Left: the readable pseudo-C IDA gives us. Right: the byte-identical, rebuildable .S we emit from it.