Gordon Freeman's Broken Door
When porting Half-Life 2 to VR in 2013 at Valve, developers discovered that a door at the very start of the game refused to open -- a bug caused by floating-point precision differences between the original x87 and modern SSE instruction sets.
A recent discussion about the dangers of doors in game development reminded me of a bug caused by a door in Half-Life 2.
I once worked at Valve on virtual reality projects. It was 2013, around the time the Oculus DK1 appeared. Joe Ludwig and I decided that the best way to understand how VR would work in the context of a real game was to port an actual game into it.
We chose Team Fortress 2. TF2 ran on the Source 1 engine, and so did two other Valve games, Half-Life 2 and Portal 1. As a side effect, they would work in VR too.
Well, Portal 1 technically "worked," but all the perspective tricks when passing through portals caused genuine nausea, making it practically impossible to play.
HL2, on the other hand, played quite well. Joe spent a fair amount of time getting the boat levels to work properly.
Near the very beginning of the game, there's an episode where you stack boxes on top of each other. In the original it was pretty annoying — the boxes kept falling off — but in VR, stacking them was very easy.
Also, destroying manhacks with the crowbar, which on flat screens looked like panicked flailing, turned into an elegant, precise strike in VR.
Fortunately, there were other reasons to re-release HL2, and the VR version worked pretty well, so we added VR support behind a command-line option, called it a beta, rebuilt all of HL2, and started preparing for release.
Of course, by that point we had been playing HL2 for quite a long time, testing all VR elements for functionality. But we only jumped to the most important chapters, never playing through the game from the very beginning. I hadn't played through the game in a while, so I decided to do it in VR, from start to finish. If I found something that didn't work, I could at least document it in the release notes.
The Bug
So I launched HL2, selected New Game, and started the introduction. This is the famous part of the story: arriving at the train station, Breen's message, the guard makes you pick up the can, and then you need to enter a room... and... I got stuck. I didn't die; I was simply stuck in a corridor with a guard and couldn't go anywhere.
What's actually supposed to happen is this: the guard (spoiler: it's actually Barney in disguise) knocks on the door, it opens, he orders you to go inside, and then the game waits for the player to enter the room before continuing the script.
But in this case, the door made a noise, failed to open, and shut again, so it was impossible to get into the room. The gates behind me had already closed, so there was nothing else to do. The guard waited forever, pointing at the closed door, and I couldn't proceed.
I searched the Internet for videos, thinking my memory was failing me, but no: the door is supposed to open automatically, after which the player walks through it.
But that doesn't happen!
Oh no, we can't release the game in this state. I gathered people together, including those who originally worked on HL2; yes, it turned out the game was broken. And it was broken even when playing outside of VR — it wasn't caused by anything Joe and I had done. But nobody knew why: the relevant code hadn't changed.
Someone even went back through the source history and compiled the original game exactly as it had been released: nope, the original version was broken too. How could this be? People started going crazy: this wasn't some ordinary bug — it had traveled back in time and infected the original!
While they spent roughly a day relearning how to use the debugging and replay tools, someone clever (unfortunately, I don't remember who) figured out what was going on.
The Root Cause
As you can see in the video, when the door opens, there's a second guard standing inside the room, to the left of the opening door. This guard is standing just a little too close — the very corner of his bounding box intersects with the door's trajectory as it opens. The door starts to open, barely touches the guard's foot, bounces off it, and automatically closes again. And since there's no script to handle this situation and reopen the door, the player gets stuck at this point.
Once they figured this out, fixing the bug was very easy — just move the guard back by roughly a millimeter. Elementary. But figuring it out took a lot of work, because the developers had to relearn the debugging tools and so on.
Great, now they could release the game. But why did it ever work in the first place? The guard's foot was in the door's path in the original too. As I said, they went back and compiled the original from the released source code, and the bug appeared there as well. It had always been there. Why hadn't the door bounced shut before? How had the game ever shipped?
Floating-Point Precision: x87 vs. SSE
All these questions led them to begin an even longer bug hunt. The answer turned out to be good old floating-point numbers. The original Half-Life 2 was released in 2004, and although the SSE instruction set already existed, it wasn't used everywhere, so most of HL2 was compiled to use the older 8087 (x87) math instruction set. Precision in x87 varied chaotically: some things were calculated in 32 bits, some in 64 bits, some instructions were 80-bit, and the precision you got in any particular piece of code was a rather confusing matter.
But ten years later, in 2013, SSE had long been standard on all x86 processors. Operating systems relied on its presence, so applications could rely on it too. Compilers now used it by default; in fact, generating the older (slightly slower) x87 code would have required workarounds. SSE uses much more clearly defined precision (32-bit or 64-bit, depending on what the code requires), making it far more predictable.
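The effect of where rounding happens can be reproduced without any x87 hardware. What follows is a minimal sketch in Python, not anything taken from the Source engine. It uses Python's 64-bit float as a stand-in for wider intermediate precision, and a hypothetical helper `f32` to emulate storing a value in a 32-bit float. The function name `accumulate` and the step value 0.1 are illustrative assumptions.

```python
import struct


def f32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(step: float, count: int, round_every_step: bool) -> float:
    """Add `step` to a running total `count` times.

    round_every_step=True  emulates strict 32-bit arithmetic (SSE-style).
    round_every_step=False keeps wider intermediates and rounds to
    32 bits only once at the end (a stand-in for x87-style behavior).
    """
    step = f32(step)  # the input itself is a 32-bit value in both modes
    total = 0.0
    for _ in range(count):
        total += step
        if round_every_step:
            total = f32(total)
    return f32(total)  # the final result is stored as a 32-bit float


strict = accumulate(0.1, 10, round_every_step=True)
wide = accumulate(0.1, 10, round_every_step=False)
print(strict, wide, strict == wide)
```

The same ten additions give two different 32-bit answers: the strict version ends one unit in the last place above 1.0, while the wide version lands exactly on 1.0. Neither is wrong; they simply round at different points.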
So, problem solved? Because of 80-bit precision the collision didn't happen, but with 32-bit it does, meaning more bits is better? Well, not exactly.
The guard's foot intersects with the door's trajectory in both cases — a few millimeters is still far larger than the error margins of either precision. In both the SSE and x87 versions, the door hits the guard's little toe. So far they agree.
This collision is modeled properly — one of HL2's key innovations was its extensive use of a real physics engine. Both the door and the guard are physical objects, both have momentum, both impart impulse to each other, and while the door's hinges have no friction, the guard's boots have a certain degree of friction with the floor.
In both versions, the door has enough momentum to slightly rotate the guard. The guard's friction against the floor isn't quite enough to counteract this, and he rotates by a tiny fraction of a degree. In the x87 version, this tiny rotation is just enough to move his foot out of the door's path, the collision resolves, and the door continues to open. Everything works fine.
But in the SSE version, the low-order bits of these calculations come out slightly different. Given the combination of floor friction and object masses, the guard still rotates from the collision, but not by quite enough. His foot remains in the door's path. The door bounces back, closes, and the player is stuck forever.
The difference in precision between x87's 80-bit intermediate calculations and SSE's strict 32-bit floats was just enough to tip the physics simulation one way or the other. In the x87 version, the slightly different rounding meant the guard's rotation cleared the door by a hair's breadth. In the SSE version, the rounding went the other way, and the guard's toe stayed put by an equally tiny margin.
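How a rounding difference turns into a yes-or-no outcome can be shown with a toy model. This is only a sketch under made-up numbers: the heading, the nudge, the clearance threshold, and the function `door_opens` are all hypothetical and have nothing to do with HL2's actual physics code. Python's 64-bit float again stands in for wider intermediates.

```python
import struct


def f32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def door_opens(strict_f32: bool) -> bool:
    """Toy model: does the guard turn far enough to clear the door?

    All numbers are invented for illustration.
    """
    # In strict mode every intermediate is rounded to 32 bits;
    # otherwise intermediates keep their wider precision.
    rnd = f32 if strict_f32 else (lambda v: v)

    heading = f32(1.0)   # guard's heading before the hit, in radians
    nudge = f32(1e-8)    # rotation imparted by the door
    needed = f32(5e-9)   # rotation required for the foot to clear

    new_heading = rnd(heading + nudge)  # strict mode absorbs the nudge here
    turned = rnd(new_heading - heading)
    return turned > needed


print("wide intermediates:", door_opens(strict_f32=False))
print("strict 32-bit:     ", door_opens(strict_f32=True))
```

With wide intermediates the tiny nudge survives the addition and the comparison passes. With strict 32-bit rounding the nudge is smaller than half a unit in the last place at 1.0, so it vanishes and the comparison fails. The inputs are identical; only the rounding points differ.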
It's a perfect example of how platform-dependent floating-point rounding can hide bugs for years. The level geometry was always broken: the guard was always placed too close to the door. But the x87 math accidentally produced a result that worked around the problem, and nobody ever noticed. When the compiler switched to SSE by default, the different rounding behavior revealed what had been lurking all along.
The fix, as mentioned, was trivial: nudge the guard back by about a millimeter so his bounding box no longer intersected the door's sweep. But the investigation — relearning decade-old tools, bisecting builds, and ultimately understanding how a change in floating-point instruction sets could break a door — took far longer than the fix itself.