Science

The Secret Equation That Let the US Spy on Everyone

In 2006, the NSA secretly embedded a mathematical backdoor into the Dual EC DRBG cryptographic standard, allowing decryption of SSL traffic across the internet. This article provides a technical deep dive into how the backdoor worked, with working Python code demonstrating that any home computer can exploit it in roughly two minutes.

In 2006, the NSA hid a mathematical backdoor inside a cryptographic standard. The agency denied its existence for eight years. Then Edward Snowden's leaked documents confirmed what researchers had suspected all along: the Dual Elliptic Curve Deterministic Random Bit Generator (Dual EC DRBG), published by the National Institute of Standards and Technology in Special Publication 800-90, contained a deliberate vulnerability that allowed those who knew the secret to predict the output of the random number generator — and therefore decrypt SSL traffic secured with it.

Why Random Number Generators Are the Jugular

Cryptographic security ultimately rests on unpredictability. A private key you can guess is no key at all. When the source of randomness used to generate keys is compromised, the entire edifice of modern encryption collapses. History offers brutal examples:

2009: A flaw in the cookie generation code at Hacker News allowed attackers to hijack user accounts by predicting session tokens
2013: A vulnerability in Android's RNG implementation enabled thieves to derive the private keys of Bitcoin wallets and steal funds

These were accidents. The Dual EC DRBG backdoor was not.

How the Algorithm Works — and How It Was Broken

The Dual EC DRBG operates on elliptic curve arithmetic. It maintains an internal state (a seed) and uses two special points on the P-256 elliptic curve — call them P and Q — to generate pseudorandom output. At each step:

Multiply the current seed by point P to get a new internal state
Multiply that new state by point Q
Take the x-coordinate of the result and output 30 bytes (with 2 bytes discarded)

The mathematical security of this scheme depends on the assumption that the relationship between P and Q is unknown — that they are two independently chosen points on the curve with no computable relationship between them. NIST published their values without explanation of how they were derived.

In 2007, researchers Dan Shumow and Niels Ferguson presented a 15-slide paper at the CRYPTO rump session with a devastating observation: if someone knows a secret integer e satisfying Q = e·P, they can use just 32 bytes of the generator's output to reconstruct the entire internal state and predict all future outputs. This relationship — a scalar multiple connecting Q back to P — is the backdoor.

The origin of the Q constant published by NIST has never been explained.

A Python Implementation

Here is the basic Dual EC DRBG class, faithful to the NIST specification:

class DualEC():
    def __init__(self, seed, P, Q):
        self.seed = seed
        self.P = P
        self.Q = Q

    def next(self):
        # Update internal state
        r = (self.seed * self.P).x
        self.seed = r
        # Generate output
        out = (r * self.Q).x
        # Truncate: discard top 2 bytes, return 30 bytes
        return out & ((1 << 240) - 1)

The 2-byte truncation is the crack in the armor. It was supposedly a security measure. In reality it is what makes the backdoor exploitable: it leaves just enough information for an attacker with knowledge of e to reconstruct the full point and recover the internal state.

Exploiting the Backdoor

The exploitation procedure requires three helper functions:

1. Modular square root on P-256:

def p256_mod_sqrt(x):
    # P-256 prime p ≡ 3 (mod 4), so sqrt(x) = x^((p+1)/4) mod p
    p = 0xFFFFFFFF00000001000000000000000000000000FFFFFFFFFFFFFFFFFFFFFFFF
    return pow(x, (p + 1) // 4, p)

2. Recover both candidate curve points from a truncated x-coordinate:

def FindPointOnCurve(x_truncated):
    # The 2 discarded bytes give 2^16 = 65536 possible x values
    candidates = []
    for prefix in range(65536):
        x = (prefix << 240) | x_truncated
        # Check if x^3 - 3x + b is a quadratic residue mod p
        y_sq = (pow(x, 3, p) - 3*x + b) % p
        y = p256_mod_sqrt(y_sq)
        if pow(y, 2, p) == y_sq:
            candidates.append((x, y))
            candidates.append((x, p - y))
    return candidates

3. Predict future output:

def GeneratePrediction(observed_output, e, P, Q):
    # For each candidate point reconstructed from observed output:
    for (x, y) in FindPointOnCurve(observed_output):
        # Multiply by e (the secret) to recover internal state candidate
        r_candidate = (e * Point(x, y)).x
        # Use this candidate state to predict next outputs
        predicted = (r_candidate * Q).x & ((1 << 240) - 1)
        yield predicted

The Numbers

On a home computer, the full brute-force search over the 65,536 candidate x-values and the subsequent elliptic curve multiplications complete in approximately two minutes, recovering 28 bytes of the next output with certainty. The NSA, operating DoD-class supercomputing infrastructure, could run this against captured SSL handshakes in real time at scale — against the entire internet's encrypted traffic.

Timeline of Events

2006: NIST publishes SP 800-90 including Dual EC DRBG
2007: Shumow & Ferguson present backdoor mechanism at CRYPTO
2007–2013: RSA Security ships Dual EC DRBG as the default generator in its BSAFE library (later reporting it received $10 million from NSA to do so)
2013: Snowden documents confirm the backdoor was intentional
2014: NIST withdraws Dual EC DRBG from its recommendations

The episode demonstrated something uncomfortable: the trust infrastructure of internet cryptography — the standards bodies, the reference implementations, the certified libraries — can be subverted at the mathematical level in ways that are technically detectable but practically invisible for years. The equation was secret. The secret was the equation.