The Pentagon Lost Its Biggest War Game — So It Reset the Simulation. We Built One You Can't Reset.

Take command of a US–Iran crisis in the Strait of Hormuz — free, in your browser, no signup. Fair warning: the scoring is built so "Muddled Through" is the best most commanders ever do.

In 2002, the Pentagon ran its most expensive war game ever: Millennium Challenge 2002. The US military was supposed to validate its new high-tech, network-centric way of war against a rogue Middle Eastern state, a thinly veiled stand-in for Iran. It didn't go as planned.

Lieutenant General Paul Van Riper, playing the Red Team commander, ignored the script. He blinded US electronic surveillance by switching to motorcycle couriers and coded mosque broadcasts. Then he launched a coordinated swarm attack — cruise missiles from disguised commercial vessels, suicide speedboats in the Strait narrows, fast-boat waves from multiple vectors simultaneously — and sank most of the Blue Team's fleet in the opening hours. The Pentagon's response? They reset the simulation and ordered Van Riper to follow a script that guaranteed a US victory.

We built a simulation where you can't reset — free, in your browser, no signup: play it now.

The Lesson Nobody Learned

The Strait of Hormuz is 21 miles wide at its narrowest point. About twenty percent of the world's oil passes through it every day, roughly 20 million barrels. Iran controls the northern shore; Oman's Musandam Peninsula the southern. If the strait closes, the global economy doesn't slow down. It seizes.

Our Millennium Challenge simulation puts you in the chair of the CENTCOM Commander, the four-star general who runs US military operations across the Middle East, during a crisis in the Gulf. You have 44 actions: 16 military, 12 diplomatic, 8 economic, and 8 intelligence. They range from mine sweeping to a full invasion of Iran, and 23 of them are modeled on real events from 2024–2026. You manage 13 interconnected strategic systems, seven of which are invisible until they cross critical thresholds. And there is no turn limit. The crisis runs until it reaches one of eight decisive endings: ceasefire, regime collapse, regional war, nuclear breakout, and four other ways for things to end badly.

Here's what the simulation teaches, run after run:

There is no sequence of moves that produces a clean outcome.

Every military strike that degrades Iran's capacity also erodes your alliance cohesion, spikes oil prices, and hands Iran's proxies justification to activate. Every diplomatic initiative that builds legitimacy costs you domestic support from hawks demanding action. Every turn you wait, the economic bleeding gets worse, your maneuver space shrinks — and Iran's centrifuges keep spinning.

The scoring is built so that the middle grade — a C, "Muddled Through" — is the best most commanders ever do. This isn't a design failure. It's the point.

What the Simulation Shows About the Real Strategic Landscape

Play it a few times and the same core findings keep surfacing:

1. Military escalation is self-defeating above a threshold.

Somewhere around escalation 55, the crisis crosses a tipping point. Iran's menu of responses widens, its proxies grow twitchier, and the odds of an accident compound. A major strike below that line is a calculated risk; above it, it's the trigger condition for the Van Riper swarm (more below). The strike didn't change — the system around it did. Each individual step feels rational. The staircase leads somewhere nobody chose.

2. The Strait is a trap — for both sides.

If you cluster your naval forces in the Strait to project strength, Van Riper's swarm attack kills hundreds. Keep them in deep water for safety and you can't protect shipping, so the economic crisis accelerates. Seize the Strait directly and you've committed to an occupation that drains maneuver space and turns every ally against you. Iran faces a mirror dilemma: mining the Strait devastates its own economy (it exports through it too), closing it invites the exact strike it's trying to deter, and threatening to close it only works once.

The simulation models this as force positioning: four naval asset groups across three zones — deep water, Gulf interior, Strait narrows. Where you place your forces matters as much as what you order them to do.

3. Diplomacy works, but only if you've built the foundation.

You need political legitimacy of at least 40 just to get a ceasefire proposal on the table — and acceptance is a dice roll a turn later, with odds that scale with your legitimacy, your alliance cohesion, and how low you've kept escalation, capped at 85%. Diplomacy isn't a button. It's a position you have to earn.

All-military players never earn it — their legitimacy is destroyed. All-diplomacy players watch domestic support erode before the odds ever get good. The only path is a narrow corridor: enough pressure that Iran takes negotiations seriously, not so much that you've poisoned the well. That's the real lesson of deterrence theory — credible force coupled with credible off-ramps.
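To make the ceasefire math concrete, here is a minimal sketch of the gate and the capped odds. The legitimacy floor of 40 and the 85% cap come from the description above; the equal weighting of the three inputs and the function names are assumptions for illustration, not the game's actual formula.

```python
import random

LEGITIMACY_GATE = 40  # from the article: minimum legitimacy to propose a ceasefire
ODDS_CAP = 0.85       # from the article: acceptance odds never exceed 85%


def ceasefire_odds(legitimacy, alliance_cohesion, escalation):
    """Return the acceptance probability, or None if no proposal is possible.

    All inputs are on the 0-100 scale. The equal weighting below is a
    hypothetical stand-in for whatever the game really uses.
    """
    if legitimacy < LEGITIMACY_GATE:
        return None
    raw = (legitimacy + alliance_cohesion + (100 - escalation)) / 300
    return min(raw, ODDS_CAP)


def resolve_ceasefire(odds, rng=random):
    """Roll the dice one turn later: True means Iran accepts."""
    return rng.random() < odds
```

Under these assumed weights, a commander at legitimacy 40, alliance cohesion 50, and escalation 60 gets the proposal on the table but faces odds of roughly 43%, which is the "narrow corridor" in numbers.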

4. The hidden systems are the ones that kill you.

Seven of the simulation's thirteen systems are invisible until they cross critical thresholds. Even three of the "visible" political systems (alliance cohesion, political legitimacy, maneuver space) begin hidden from your dashboard and surface only once the crisis moves them.

Most players meet miscalculation risk the hard way. Somewhere in mid-game, a card appears: Friendly Fire Incident — "A coalition warship fires on what it identifies as a hostile fast boat. It turns out to be a fishing vessel. Three civilian casualties reported. Regional media coverage is intense." Escalation +8. Legitimacy −10. Alliance cohesion −12. You didn't order this. Nobody did. It happened because you managed the dashboards you could see while an invisible number crept upward with every destroyer you pushed into the narrows. The simulation keeps a library of these authored disasters and draws them more often as your hidden risk grows.

The 9/11 Commission called this a "failure of imagination." The system you forgot to watch is the one that ends your game.

5. You aren't just fighting Iran — you're racing a clock.

While you juggle the visible crisis, the centrifuges keep spinning. The simulation starts Iran at 60% enrichment — calibrated to 2025 IAEA reporting — and the breakout timeline ticks while you're doing anything else. Let Iran assemble a deliverable weapon and the game ends in nuclear breakout, even if you were winning everywhere else. Striking Natanz and Fordow buys time back — the raid degrades about 70% of Iran's centrifuge capacity — at catastrophic cost to escalation, legitimacy, and alliances.

There's also a second, darker route to victory the briefing never advertises: covert regime pressure — diaspora networks, funded labor strikes, armed Kurdish militias, sabotage campaigns — that can collapse the government in Tehran before the crisis collapses you. In this game, waiting is never neutral.

▶ Think you'd handle it better? Take command — free, in your browser, no signup.

How It Works Under the Hood

The game is free to play — no download, no signup — and the simulation model runs entirely client-side, in a single HTML file. Only your final score leaves the tab, bound for a cross-player leaderboard (log in to save yours). Behind the curtain:

Thirteen Interconnected Systems

The simulation tracks thirteen systems, each scored 0–100. Six belong on your dashboard: escalation (the master variable — hitting 95 means regional war, game over), domestic stability, economic shock, alliance cohesion, political legitimacy, and maneuver space. The other seven — shipping security, proxy volatility, miscalculation risk, humanitarian strain, congressional pressure, media intensity, narrative control — stay invisible until they cross a threshold and force their way onto your screen, usually as bad news.

These aren't independent sliders — 27 cascade rules connect them. Take one real order: Strike IRGC Bases, precision strikes on Revolutionary Guard command posts. The direct effects: escalation +14, alliance cohesion −5, political legitimacy −6 — and, tellingly, domestic stability +5, because hawks at home rally behind action. Invisibly, miscalculation risk just climbed 8 points, and a 35% roll is waiting: "Residential area near IRGC facility damaged — civilian casualties." Then the cascades take over: the escalation spike drags domestic stability back down, pushes economic shock up, chips at alliances and legitimacy — and each of those changes triggers knock-ons of its own, rippling through three passes, each weaker than the last unless the receiving system is already stressed, in which case the shock amplifies 1.5x. One order. Two dozen consequences.
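The cascade mechanic can be sketched in a few lines. The three passes, the 1.5x amplifier, and the Strike IRGC Bases direct effects are taken from the description above. The three rules shown, their weights, the halving between passes, and the stress cutoff of 70 are all hypothetical placeholders for the game's 27 real rules.

```python
PASSES = 3              # from the article
STRESS_AMPLIFIER = 1.5  # from the article
DAMPING = 0.5           # assumption: each pass is half as strong as the last
STRESS_THRESHOLD = 70   # assumption: where a system counts as "already stressed"

# Systems where a LOW value is the dangerous one (assumption).
GOOD_WHEN_HIGH = {"domestic_stability", "alliance_cohesion", "political_legitimacy"}

# (source, target, weight): a hypothetical subset of the cascade rules.
RULES = [
    ("escalation", "domestic_stability", -0.3),
    ("escalation", "economic_shock", 0.4),
    ("economic_shock", "domestic_stability", -0.2),
]


def clamp(value):
    return max(0.0, min(100.0, value))


def is_stressed(system, value):
    if system in GOOD_WHEN_HIGH:
        return value <= 100 - STRESS_THRESHOLD
    return value >= STRESS_THRESHOLD


def apply_order(state, direct_effects):
    """Apply an order's direct effects, then ripple them through the rules."""
    state = dict(state)
    for system, delta in direct_effects.items():
        state[system] = clamp(state[system] + delta)

    deltas = dict(direct_effects)
    strength = 1.0
    for _ in range(PASSES):
        knock_ons = {}
        for source, target, weight in RULES:
            if source not in deltas:
                continue
            shock = deltas[source] * weight * strength
            if is_stressed(target, state[target]):
                shock *= STRESS_AMPLIFIER
            knock_ons[target] = knock_ons.get(target, 0.0) + shock
        for system, delta in knock_ons.items():
            state[system] = clamp(state[system] + delta)
        deltas = knock_ons   # only this pass's changes feed the next pass
        strength *= DAMPING
    return state


STRIKE_IRGC_BASES = {  # direct effects as quoted in the article
    "escalation": 14,
    "alliance_cohesion": -5,
    "political_legitimacy": -6,
    "domestic_stability": 5,
}
```

Even with three toy rules, the rally-round-the-flag bump to domestic stability is mostly gone by the end of the third pass, which is the behavior the paragraph above describes.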

Why You Can't Memorize a Winning Strategy

No two runs play out the same way. Five sources of randomness interact:

Unintended consequences: Most actions carry a 15–70% chance of side effects. Mine clearance has a 30% chance of provoking an escalation spike; the strike on the nuclear sites at Natanz and Fordow has a 40% chance of radioactive contamination. You can know the probabilities, not the outcome.

Casualty variance: Every military action rolls its casualties within a range — a naval escort might cost 0–3 US lives and 0–8 Iranian — so the same strategy produces different political consequences across playthroughs.

Delayed effects: Actions trigger consequences 1–3 turns later. A ceasefire proposal resolves a turn after you make it; a strike on the nuclear sites brings a UN emergency session. You're always managing present crises and past decisions at once.

Starting variance: Each replay applies ±10% variance to all baseline values — escalation might start at 23 or 27 instead of 25 — and that small difference cascades for as long as the crisis runs.

Institutional blindness: Each turn there's a 20% chance an advisor dismisses a low-confidence intel report that contradicts the dominant narrative. If it later proves accurate, the advisor concedes the call was premature — but you've already decided without it.
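Three of the dice described above are simple enough to sketch directly. The ±10% spread, the percentage chances, and the casualty ranges come from the text. The uniform distributions and the function names are assumptions, since the article does not say how the game draws its numbers.

```python
import random


def jitter_baselines(baselines, rng, spread=0.10):
    """Starting variance: scale every baseline by a factor within +/-10%."""
    return {
        name: value * rng.uniform(1 - spread, 1 + spread)
        for name, value in baselines.items()
    }


def roll_side_effect(chance, rng):
    """Unintended consequences: True if the side effect fires this time."""
    return rng.random() < chance


def roll_casualties(low, high, rng):
    """Casualty variance: an integer drawn from the action's range, inclusive."""
    return rng.randint(low, high)


rng = random.Random()  # pass random.Random(seed) for a repeatable run
start = jitter_baselines({"escalation": 25}, rng)
mine_clearance_backfires = roll_side_effect(0.30, rng)
escort_us_losses = roll_casualties(0, 3, rng)
```

Knowing that mine clearance backfires 30% of the time tells you nothing about whether it backfires on this turn, and that is the whole problem.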

Five AIs, None of Them Fully in Control

Iran isn't scripted — and it isn't alone. Five AI actors take their own turns each round: Iran, its proxy militias, the Gulf allies, Russia, and China, each drawing from its own menu of moves, with the odds shifting as the crisis evolves. Below escalation 30, Tehran mostly postures — rhetoric and information warfare. Past 30, naval provocations. Past 35, proxy activation. Past 45, covert mining of the Strait.

And once proxy volatility crosses 50, the militias stop waiting for Tehran's permission and start freelancing — rogue rocket barrages, maritime harassment, border incidents, drawn from their own action pool. Which is how you end up in a shooting war that neither capital actually ordered. That's not a bug in the model. That's the model.

Iran also holds three red lines, each a one-shot, deterministic retaliation. Launch three or more strikes within three turns and Tehran activates its entire regional proxy network. Push escalation to 80 or beyond with military action and Iran starts issuing threats over its nuclear facilities. The third is Van Riper's moment: escalation at 55 or above plus a major US strike, and Iran launches the coordinated swarm attack — the exact tactics from the 2002 exercise. Most players cross that line without realizing it, through choices accumulated over many turns — which is exactly what happened in the real exercise.
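Here is one way the escalation thresholds and red lines above could be wired up. The numbers (30, 35, 45, 55, 80, three strikes in three turns) are from the text. The action names, the class, and the reading of "within three turns" as the current turn plus the previous two are assumptions.

```python
from dataclasses import dataclass, field


def iran_menu(escalation):
    """Moves available to Tehran at a given escalation level."""
    menu = ["rhetoric", "information_warfare"]
    if escalation > 30:
        menu.append("naval_provocation")
    if escalation > 35:
        menu.append("proxy_activation")
    if escalation > 45:
        menu.append("covert_mining")
    return menu


@dataclass
class RedLines:
    """Tracks which one-shot retaliations have already fired."""
    fired: set = field(default_factory=set)

    def check(self, escalation, strike_turns, turn, military_action, major_strike):
        # Strikes on the current turn or either of the two before it (assumption).
        recent = [t for t in strike_turns if 0 <= turn - t < 3]
        conditions = {
            "proxy_network_activation": len(recent) >= 3,
            "nuclear_threats": escalation >= 80 and military_action,
            "swarm_attack": escalation >= 55 and major_strike,
        }
        newly = [
            name for name, crossed in conditions.items()
            if crossed and name not in self.fired
        ]
        self.fired.update(newly)  # each red line triggers only once
        return newly
```

Note that nothing in `check` looks at intent. A major strike at escalation 54 is a calculated risk, and the same strike one point higher launches the swarm.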

Iran adapts its communications security, too. Lean on SIGINT — signals intelligence, intercepting Iran's radio and electronic traffic — and Iran shifts to motorcycle couriers, then coded mosque broadcasts. At maximum adaptation, only 30% of your normal intelligence capability remains. Van Riper did exactly this in 2002.

Five Possible Futures

After every move you make, the game silently replays the crisis two hundred times, projecting four turns into the future, and counts how those two hundred wars end — the technique statisticians call Monte Carlo simulation. The result is the five probability bars on your screen:

Contained Crisis — starts at 30%
Prolonged Instability — 28%
Regional Escalation — 22%
Global Energy Crisis — 12%
Forced Ceasefire — 8%

Watching them move after each of your actions is one of the simulation's most revealing features — you can see your decision space narrowing in real time.
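The projection loop itself is short. The 200 runs, the four-turn horizon, and the five outcome names are from the description above. The one-variable random walk and the escalation bins used to classify each run are deliberately crude placeholders, since the real game rolls its full model forward.

```python
import random
from collections import Counter

RUNS = 200     # from the article
HORIZON = 4    # from the article: turns projected ahead

OUTCOMES = [
    "Contained Crisis",
    "Prolonged Instability",
    "Regional Escalation",
    "Global Energy Crisis",
    "Forced Ceasefire",
]


def step(escalation, rng):
    """Hypothetical one-turn transition: a slightly upward-biased random walk."""
    return max(0.0, min(100.0, escalation + rng.uniform(-8, 10)))


def classify(escalation):
    """Hypothetical mapping from final escalation to an outcome bar."""
    if escalation < 20:
        return "Forced Ceasefire"
    if escalation < 40:
        return "Contained Crisis"
    if escalation < 60:
        return "Prolonged Instability"
    if escalation < 80:
        return "Regional Escalation"
    return "Global Energy Crisis"


def project(escalation, seed=None):
    """Replay the next HORIZON turns RUNS times and return outcome frequencies."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(RUNS):
        level = escalation
        for _ in range(HORIZON):
            level = step(level, rng)
        counts[classify(level)] += 1
    return {outcome: counts[outcome] / RUNS for outcome in OUTCOMES}
```

Calling `project` again after each action, with the new state, is what makes the bars move.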

The Scoring Problem

At the end of each game, you're scored across eight weighted dimensions: De-Escalation, Domestic Stability, Economic Control, Alliance Cohesion, Political Legitimacy, Minimize Casualties, Pressure on Iran — and, newest on the roster, Nuclear Containment: how far Iran got toward a weapon on your watch. How the war ends adds a bonus or penalty on top, from +15 for collapsing the regime down to −15 for regional war.

The scoring reveals the fundamental tension: maximizing any single dimension undermines the others. Crushing Iran (a high pressure score) requires strikes that destroy your legitimacy and alliance scores. Keeping casualties at zero means taking no military action, which means no pressure, no credible deterrent, and a ceasefire nobody has a reason to sign.

The grade scale runs from A (Strategic Masterstroke, scored by almost nobody) through C (Muddled Through, the center of the scale) to F (Catastrophic Failure). The weights are set so that balanced mediocrity across all eight dimensions beats excellence in three with failure everywhere else. And your grade doesn't live in a vacuum: the game ships with a cross-player Top Commanders leaderboard, so your best run goes up against everyone else's. Most commanders muddle through. See if you can crack the board.
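A quick sketch shows why balance wins. The eight dimension names and the +15 and −15 ending modifiers are from the text above. The equal weights are an assumption (the real weights are not published here), and the other six endings are left out rather than guessed.

```python
DIMENSIONS = [
    "De-Escalation",
    "Domestic Stability",
    "Economic Control",
    "Alliance Cohesion",
    "Political Legitimacy",
    "Minimize Casualties",
    "Pressure on Iran",
    "Nuclear Containment",
]

# Assumption: equal weights. The game's actual weights may differ.
WEIGHTS = {name: 1 / len(DIMENSIONS) for name in DIMENSIONS}

# From the article; the modifiers for the other endings are not stated.
ENDING_MODIFIER = {"regime_collapse": 15, "regional_war": -15}


def final_score(scores, ending=None):
    """Weighted sum of 0-100 dimension scores plus the ending modifier."""
    base = sum(WEIGHTS[name] * scores[name] for name in DIMENSIONS)
    return base + ENDING_MODIFIER.get(ending, 0)


balanced = {name: 50 for name in DIMENSIONS}
hawk = {name: 0 for name in DIMENSIONS}
for name in ("Pressure on Iran", "Nuclear Containment", "Domestic Stability"):
    hawk[name] = 100

print(final_score(balanced))  # 50.0
print(final_score(hawk))      # 37.5
```

Under these assumed weights, a perfect score in three dimensions with zeros elsewhere lands at 37.5, well below the all-fifties commander, and ending the run in regional war would drag either score down by another 15.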

This is the simulation's thesis: in a Hormuz crisis, "muddling through" isn't a failure of strategy — it's the best realistic outcome.

What This Means

The simulation isn't predictive. It doesn't claim to know what would happen in a real US–Iran conflict. What it does is make the structural constraints visible.

Every player who sits down expecting to find the "right answer" discovers there isn't one. The hawks who go full military trigger Van Riper's swarm attack or crash the global economy. The doves who pursue pure diplomacy watch their domestic support collapse and their maneuver space evaporate — while the centrifuges spin. The clever players who try to thread the needle discover that the needle moves, because the system responds to their strategy and adapts.

The real Millennium Challenge 2002 exposed this same truth, and the Pentagon's response was to reset the simulation until the "right" answer appeared. Twenty-plus years later, the Strait of Hormuz remains the most dangerous chokepoint in global energy infrastructure, and the strategic dilemma Van Riper exposed remains unresolved.

You can't reset this one. Play it and see what you learn.

Disagree With the Model? Change It.

Every assumption above — the cascade weights, Iran's red lines, the ceasefire math — is open to argument. On Go Bananas you can remix this simulation: fork the code as a free draft (it costs zero tokens), tune the parameters you think are wrong, and publish your version with automatic attribution. Or describe a completely different crisis in a sentence — a Suez blockade, a Baltic standoff — and get a playable game back while your coffee is still hot.

If this one hooked you, read the companion deep-dive: what a Taiwan Strait simulation teaches about a US–China war — the same idea, calibrated to CSIS's 2023 study. And for the platform story, read how Go Bananas turns a sentence into a playable game.

▶ Play Millennium Challenge now, free, no signup: gobananas.co/game/millennium-challenge

▶ Make your own at gobananas.co — describe it in a sentence, play it in about half a minute. A free account comes with 150K AI tokens: roughly 10–20 games on the house.
