retrowin32's third x86 emulator

November 18, 2023

This post is part of a series on retrowin32.

Since my last update, I refined my emulate-via-Rosetta code to the point where a demo ran with some cool graphics!

screenshot of a macOS window showing the demo
mofo by Psikorp, Dreamhack 1999 64k winner

But I only have a screenshot to share with you and not a URL, because to run on the web retrowin32 uses its own x86 emulator and this demo fails to work under that. It's clearly a bug, but where?

Unfortunately this bug is hard to track down because it manifests as the logic in the demo quietly not going quite right, without a crash or any similar smoking gun to point at the flaw. I spent more time trying to get my Windows-native tracing debugger (described in a previous post) to run the program under a similar environment, but I couldn't quite get the execution traces to align: the emulator and native Windows were too different.

What I really needed, I thought, was a second emulator that I could run in lockstep with mine such that I could find exactly the point where the two emulators diverge in behavior.
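To make the lockstep idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and is not retrowin32 or Unicorn code: `ToyCpu` stands in for an emulator that can single-step, `PROGRAM` is a made-up three-instruction program, and `add_buggy` plays the role of a cut corner (it forgets to wrap at 32 bits).

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical toy program: (op, register, immediate) tuples.
PROGRAM = [
    ("mov", "eax", 0xFFFFFFFF),
    ("add", "eax", 1),  # overflows 32 bits
    ("mov", "ebx", 5),
]


def add_wrapping(a: int, b: int) -> int:
    """Reference behavior: 32-bit wraparound."""
    return (a + b) & 0xFFFFFFFF


def add_buggy(a: int, b: int) -> int:
    """Deliberately wrong: no wraparound."""
    return a + b


@dataclass
class ToyCpu:
    """Stand-in for an emulator that can execute one instruction at a time."""
    program: list
    add: Callable[[int, int], int]
    regs: dict = field(default_factory=lambda: {"eax": 0, "ebx": 0})
    pc: int = 0

    def step(self) -> bool:
        """Run one instruction; return False once the program has ended."""
        if self.pc >= len(self.program):
            return False
        op, reg, value = self.program[self.pc]
        if op == "mov":
            self.regs[reg] = value
        elif op == "add":
            self.regs[reg] = self.add(self.regs[reg], value)
        self.pc += 1
        return True

    def state(self):
        """Snapshot of everything we want the two emulators to agree on."""
        return (self.pc, tuple(sorted(self.regs.items())))


def first_divergence(a: ToyCpu, b: ToyCpu, max_steps: int = 1_000_000) -> Optional[int]:
    """Step both CPUs together; return the index of the first step after
    which their states differ, or None if they agree until the end."""
    for i in range(max_steps):
        ran_a, ran_b = a.step(), b.step()
        if ran_a != ran_b or a.state() != b.state():
            return i
        if not ran_a:
            return None
    return None
```

Calling `first_divergence(ToyCpu(PROGRAM, add_wrapping), ToyCpu(PROGRAM, add_buggy))` reports the `add` instruction as the first step where the two disagree. With real emulators the compared state would presumably also need to cover flags and memory, which this sketch leaves out.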

Unicorn emulator

Unicorn is a CPU emulator that is basically QEMU wrapped up as a library. Just what I needed! I "just" needed to retarget retrowin32 to work with Unicorn.

It's never quite so easy, of course. One piece of Windows emulation is that the FS register must point at a thread-specific Windows data structure. In retrowin32's own emulator I just specially handle memory accesses that involve FS. Under Rosetta I added an entry to the LDT for this. But Unicorn more fully emulates the CPU, which meant I needed to set up a GDT, not only for FS but for all the other segments too.
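For readers who have not built a GDT by hand, the sketch below shows how one 8-byte x86 segment descriptor is packed, and how a selector value is formed from a table index. The bit layout is the standard 32-bit one; the function names, the access and flag values chosen, and the example base address are assumptions for illustration, not the values retrowin32 uses or that Unicorn necessarily requires.

```python
import struct


def gdt_entry(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack one 32-bit segment descriptor.

    base:   32-bit linear address where the segment starts
    limit:  20-bit limit (in bytes, or in 4 KiB pages if the G flag is set)
    access: access byte (present bit, DPL, type)
    flags:  high nibble of byte 6 (0x8 = 4 KiB granularity, 0x4 = 32-bit)
    """
    assert 0 <= base <= 0xFFFFFFFF and 0 <= limit <= 0xFFFFF
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                              # limit bits 0..15
        base & 0xFFFF,                               # base bits 0..15
        (base >> 16) & 0xFF,                         # base bits 16..23
        access & 0xFF,                               # access byte
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),  # flags + limit bits 16..19
        (base >> 24) & 0xFF,                         # base bits 24..31
    )


def selector(index: int, rpl: int = 0, ldt: bool = False) -> int:
    """Build a segment selector: table index, table indicator bit, privilege level."""
    return (index << 3) | (0x4 if ldt else 0) | (rpl & 0x3)


# Assumed example: a small data segment whose base is a made-up address
# standing in for the thread-specific structure that FS should reach.
HYPOTHETICAL_THREAD_DATA = 0x7FFD_E000
fs_descriptor = gdt_entry(HYPOTHETICAL_THREAD_DATA, 0xFFF, access=0xF3, flags=0x4)
fs_selector = selector(index=3, rpl=3)
```

The point of the descriptor is that an access like `fs:[0x18]` resolves to base plus 0x18, so once FS holds a selector for this entry the emulated program reaches the thread-specific structure without any special-casing in the memory path.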

The other big hurdle was handling calls between the emulator and emulated code. The emulator hands off control to the emulated executable's main(), which then may call back into the emulator via Windows API calls, which then may need to call back into the emulated executable. For example, the Windows DispatchMessage() API calls the window's registered wndproc. Getting the hooks in exactly the right places to make this work was more challenging than I expected, in part because Unicorn is not exactly well-documented.
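One common way to structure calls in both directions is with sentinel addresses: the guest "calls" an address that the host intercepts, and the host calls the guest by pushing a magic return address and running until execution reaches it. The toy model below illustrates that technique only; it is an assumption about the general shape of the problem, not a description of how retrowin32 or Unicorn's hooks work, and every name and address in it is hypothetical.

```python
MAGIC_RETURN = 0xFFFF_0000  # hypothetical sentinel: guest function returned to host
DISPATCH_MESSAGE = 0xF000_0000  # hypothetical address of a host-implemented API
WNDPROC = 100  # hypothetical guest address of the window procedure


class ToyMachine:
    """Tiny guest CPU: code maps address -> instruction tuple."""

    def __init__(self, code, apis):
        self.code = code    # guest instructions
        self.apis = apis    # address -> host function(machine)
        self.stack = []     # return addresses only, for simplicity
        self.pc = 0
        self.log = []

    def run_until(self, stop_pc):
        while self.pc != stop_pc:
            if self.pc in self.apis:
                # Guest -> host: run the host implementation, then return
                # to whatever address the guest's call pushed.
                self.apis[self.pc](self)
                self.pc = self.stack.pop()
                continue
            op, *args = self.code[self.pc]
            if op == "call":
                self.stack.append(self.pc + 1)
                self.pc = args[0]
            elif op == "ret":
                self.pc = self.stack.pop()
            elif op == "log":
                self.log.append(args[0])
                self.pc += 1

    def call_guest(self, addr):
        """Host -> guest: nested run loop that stops at the sentinel."""
        saved_pc = self.pc
        self.stack.append(MAGIC_RETURN)
        self.pc = addr
        self.run_until(MAGIC_RETURN)
        self.pc = saved_pc


def dispatch_message(machine):
    """Host-side stand-in for an API that calls back into guest code."""
    machine.log.append("DispatchMessage")
    machine.call_guest(WNDPROC)


CODE = {
    0: ("log", "main start"),
    1: ("call", DISPATCH_MESSAGE),
    2: ("log", "main end"),
    3: ("ret",),
    WNDPROC: ("log", "wndproc"),
    WNDPROC + 1: ("ret",),
}

machine = ToyMachine(CODE, {DISPATCH_MESSAGE: dispatch_message})
machine.call_guest(0)  # the host starts the guest's main()
```

After this runs, `machine.log` records main starting, the host API running, the wndproc running inside it, and main finishing. The nested `run_until` is what lets the host API block until the guest callback returns, and it is also where a real emulator has to be careful about saving and restoring CPU state.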

In all, Unicorn is a useful tool to be aware of. While we're on the subject, the Qiling Framework wraps Unicorn with OS-specific loaders to let you load executables from multiple different operating systems and poke at them. Digging through their code, I noticed they too struggled with making calls work in both directions.

Bringing up a new emulator

But in retrowin32 today Unicorn seems to work. And since I've now brought up Windows emulation via three different x86 emulators, I have a better picture of how to do it. For future reference (or if you too somehow decide writing a Windows emulator is a good idea!) here is a recipe:

  1. Get my winapi executable working. This is a trivial Windows executable that just calls a few Windows APIs; I could have comfortably written it in assembly even. Getting this working means you have the PE loading bits in place and can handle calls from emulated code to the emulator.
  2. Get my zig_hello executable working. This program is superficially even simpler than the previous one but Zig uses the FS register to look up where stdout goes, so it requires figuring all that out.
  3. Get my callback executable working. This program calls into the emulator and passes a callback, exercising the full call stack of calls in both directions.
  4. Finally, pick up any other Windows EXE and draw the rest of the owl.

With this in place, I next hope to set things up such that I can finally compare my x86 emulation against the QEMU emulation to track down exactly which of the many corners I cut ended up mattering.