Under the Hood

A new kind of simulator

Posted May 16, 2026 · by Swol & Yellowfive

Every WoW player has had to deal with this moment: You have a new piece of gear in your bags, and a question that should be easy. Is this an upgrade?

The answer is rarely easy. Once you account for stats, set bonuses, talents, the rotation you actually run, the specific damage profile of the fight you're going to do, and the way every buff in the game multiplies against every other buff, "is this an upgrade?" becomes a surprisingly deep optimization problem. There are millions of valid answers, but only a handful are the best.

There are tools to help. You've probably used some of them: sim sites, gear guides, our optimizer. What most players never see is that under the hood, every one of those tools is making a trade-off between accuracy and speed. You can have an answer that's almost perfectly correct, or you can have an answer right now. Historically, you couldn't have both.

The new simulator now running on AMR for retail does something we don't think anyone else has done at this scale: it produces simulator-grade accuracy at the speed of a mathematical model. Below is how it works, what makes it different, and why it matters for gear optimization advice.

How everyone else does it

There are basically two schools of thought for scoring a gear set in WoW.

Monte Carlo simulation is the old way. Set up a virtual character, give it a virtual fight to do, then let the RNG gods roll the dice. Crits line up well or are wasted. Procs hit or whiff. Cooldowns line up with damage windows or they don't. Run the fight once, write down the damage. Now do it again. After hundreds or thousands of iterations, the noise averages out and the number you get is a reasonable estimate of how that gear set actually performs.
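To make the loop concrete, here is a toy sketch in Python. It is not SimC's code or ours. The function names, the 200-swing fight, and the crit numbers are all made up for illustration, and a real simulator models far more than crits.

```python
import random
import statistics

def simulate_fight(rng, swings=200, base_hit=1000.0, crit_chance=0.25, crit_mult=2.0):
    """Play one toy fight: roll a crit for every swing and return total damage."""
    total = 0.0
    for _ in range(swings):
        is_crit = rng.random() < crit_chance
        total += base_hit * (crit_mult if is_crit else 1.0)
    return total

def monte_carlo_estimate(iterations, seed=1):
    """Average many independent fights so the dice noise washes out."""
    rng = random.Random(seed)
    results = [simulate_fight(rng) for _ in range(iterations)]
    return statistics.mean(results), statistics.stdev(results)

# With these made-up numbers the true average is 200 * 1000 * 1.25 = 250,000.
mean, spread = monte_carlo_estimate(2000)
standard_error = spread / 2000 ** 0.5  # how far the estimate is likely to be off
print(f"estimate {mean:,.0f}, single-fight spread {spread:,.0f}, error ~{standard_error:,.0f}")
```

The spread of a single fight is large compared with the error of the average, which is why one run tells you very little and thousands of runs are needed.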

This is what SimulationCraft (SimC) does. It's also what our old simulator did, the one we ran during Legion, Battle for Azeroth, and Shadowlands. It works because it doesn't have to assume much about how the game behaves — it just plays the game and writes down what happened.

The downside is speed. Slow speed. Each individual iteration is fast, but you need a lot of them to get a stable answer, and the answer you get is for one specific gear set. To optimize gear (to look at every plausible combination of items, gems, enchants, and special effects and pick the best one), you'd need to score hundreds of thousands of configurations. With Monte Carlo, that's hundreds of thousands of configurations multiplied by a few thousand iterations each. You're not getting an answer anytime soon. Done that way, Best in Bags would take about a day from the press of the button to produce the answer we show you in a couple of seconds.

Mathematical modeling is the other school, and it's what we've been running on retail since Dragonflight. Instead of simulating the fight, you build an explicit model of what the spec does — the rotation, the buffs, the procs, the multipliers — and you write closed-form equations that take stats in and spit damage out. No dice. No iteration. No waiting. Score a gear set in microseconds.
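A minimal sketch of what "stats in, damage out" means, using a deliberately tiny model: auto-attacks that can crit and nothing else. The function name and all the numbers are hypothetical, and a real spec model has many more terms.

```python
def expected_damage(swings, base_hit, crit_chance, crit_mult):
    """Closed-form average damage: no dice, one multiplication per term."""
    return swings * base_hit * (1.0 + crit_chance * (crit_mult - 1.0))

# Two hypothetical gear sets: one favors raw hit size, the other favors crit.
set_a = expected_damage(swings=200, base_hit=1000.0, crit_chance=0.25, crit_mult=2.0)
set_b = expected_damage(swings=200, base_hit=960.0, crit_chance=0.35, crit_mult=2.0)
best = "A" if set_a >= set_b else "B"
print(f"set A: {set_a:,.0f}  set B: {set_b:,.0f}  pick: {best}")
```

Each call is a handful of arithmetic operations, which is what makes scoring hundreds of thousands of sets practical.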

This is how Best in Bags can score 200,000 to 300,000 configurations every time you click optimize. You can't do that with simulation. You can do it with math.

The trade-off is accuracy. A mathematical model must make simplifying assumptions about how things interact, and the place those assumptions hurt is timing. When a damage buff is up, your other abilities are worth more — and how often that buff aligns with a particular cooldown, or stacks with another buff, is genuinely hard to capture in a closed-form equation. You can get close. You can get very close. But you can't perfectly capture the messy, multiplicative chaos of buff stacking without playing the fight out.
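A small worked example shows why timing is the hard part. Assume two hypothetical buffs, each worth +50% damage and each active for 20 seconds of a 100-second fight. The average multiplier depends entirely on whether they overlap, and a closed-form model that treats them as independent lands somewhere in between. The helper names and numbers below are illustrative assumptions.

```python
FIGHT_SECONDS = 100

def uptime(start, duration):
    """Per-second flags: True while the buff is active."""
    return [start <= t < start + duration for t in range(FIGHT_SECONDS)]

def average_multiplier(buff_a, buff_b, bonus=0.5):
    """Average damage multiplier over the fight for two multiplicative buffs."""
    total = 0.0
    for a_up, b_up in zip(buff_a, buff_b):
        mult_a = (1.0 + bonus) if a_up else 1.0
        mult_b = (1.0 + bonus) if b_up else 1.0
        total += mult_a * mult_b
    return total / FIGHT_SECONDS

buff_a = uptime(0, 20)
aligned = average_multiplier(buff_a, uptime(0, 20))     # both buffs up together
staggered = average_multiplier(buff_a, uptime(50, 20))  # buffs never overlap
independent_guess = (1.0 + 0.5 * 0.2) ** 2              # uptime-weighted, assumes no correlation
print(aligned, staggered, independent_guess)
```

Same buffs, same uptime, and the answer moves from 1.20 to 1.25 depending on alignment, while the independence assumption gives roughly 1.21.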

So historically, you picked: accurate but slow, or fast but slightly fuzzy. For most of the last fifteen years, we've moved between those two camps depending on what the game looked like at the time.

What we built instead

The new simulator is a hybrid, and the trick is in how we handle randomness — but the architecture is proprietary, and we're going to leave most of it that way.

Here's what we can say. Every simulator has to deal with the same fundamental problem: random outcomes in the game — crits, procs, dodges — create variance that has to be averaged out before you get a stable answer. Monte Carlo averages it out by running the same fight thousands of times. That's why it's slow.

There's a well-known technique for cutting some of that variance out at the source, which most simulators (including SimC) already use in places: when a random outcome contributes noise without contributing information, you can replace it with its expected value. Weapon damage is the canonical example. Every weapon has a damage range, say 1,000 to 1,500 per swing, and over thousands of swings the average hit lands at the midpoint. There's no point rolling the dice for each hit if you already know what the average will be. So SimC, and most other simulators, just use the average. This is the simplest form of dampened randomness.
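Here is the weapon-damage case as a toy sketch. It assumes each swing rolls uniformly between the low and high end of the range, and the function names and the 100-swing fight are made up for illustration.

```python
import random
import statistics

LOW, HIGH = 1000.0, 1500.0  # the example damage range from the text

def fight_with_rolls(rng, swings=100):
    """Roll every swing inside the weapon's damage range (assumed uniform)."""
    return sum(rng.uniform(LOW, HIGH) for _ in range(swings))

def fight_with_average(swings=100):
    """Skip the dice and use the midpoint of the range for every swing."""
    return swings * (LOW + HIGH) / 2.0

rng = random.Random(7)
rolled = [fight_with_rolls(rng) for _ in range(500)]
rolled_mean = statistics.mean(rolled)
rolled_stdev = statistics.stdev(rolled)
flat = fight_with_average()
print(f"rolled: {rolled_mean:,.0f} +/- {rolled_stdev:,.0f}   averaged: {flat:,.0f}")
```

Both versions agree on the average, but the rolled version carries fight-to-fight noise that the averaged version simply doesn't have.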

We took the same idea much further. Wherever it's mathematically safe to do so — and figuring out where it's safe is real work — we use deterministic or near-deterministic sequences throughout the simulation in place of true randomness. A single iteration produces a result much closer to the average of many iterations than a Monte Carlo run would.
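To give a feel for what a deterministic sequence in place of a dice roll can look like, here is one generic, textbook-style option: an accumulator that banks crit chance every swing and crits whenever a whole crit has accrued. This is purely an illustration of the general idea and is not a description of how our simulator works. The function names and numbers are hypothetical.

```python
import random

def crits_random(rng, swings, crit_chance):
    """True randomness: every swing is an independent roll."""
    return sum(1 for _ in range(swings) if rng.random() < crit_chance)

def crits_accumulator(swings, crit_chance):
    """Deterministic stand-in: bank crit chance each swing, crit when it reaches 1."""
    banked, crits = 0.0, 0
    for _ in range(swings):
        banked += crit_chance
        if banked >= 1.0:
            banked -= 1.0
            crits += 1
    return crits

random_runs = [crits_random(random.Random(seed), 40, 0.25) for seed in range(20)]
deterministic = crits_accumulator(40, 0.25)
print("random runs:", min(random_runs), "to", max(random_runs), "crits")
print("accumulator:", deterministic, "crits, every time")
```

The random version scatters around the expected 10 crits, while the accumulator hits 10 in a single pass. The catch, and the reason this is only safe in some places, is that a fixed pattern can line up with buff windows in ways real randomness would not.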

The rest is where fifteen years of mathematical work pays off. Building deterministic models of specs, classic Monte Carlo simulators, machine-learned polynomials, and the scoring engine that runs Best in Bags — every one of those projects taught us something, and the new simulator is what falls out when you combine those lessons in ways that, as far as we can tell, no one else has tried. We're not going to describe those techniques in public. They took years to figure out, and we'd rather hold on to them.

What we will do, in future articles and on the site itself, is expose the simulator's output. You'll be able to see that the timing-dependent interactions a math model approximates are captured correctly. You'll be able to see that the buff-stacking dynamics Monte Carlo captures by playing the fight out are captured here too, in a single deterministic pass. And you'll be able to see that the whole thing runs fast enough to score 200,000–300,000 configurations every time you click optimize.

The proof is in the output, not in the recipe.

Why speed matters more than people realize

Most people don't realize how many gear combinations they're actually choosing between.

Imagine the most stripped-down scenario possible: just two items to choose from in each armor slot, three trinkets to pick two from, three rings to pick two from. That alone is over 36,000 combinations — and we haven't touched gems or enchants yet. Add even the most binary versions of those (one of two gem types for every socket, one of two weapon enchants), and you're at nearly 150,000. With an inventory that small.
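The arithmetic behind those figures can be reproduced in a few lines. The breakdown is an assumption chosen to match the totals: twelve slots with two candidates each (every slot other than rings and trinkets, weapons included), a single gem socket, and a single weapon enchant choice.

```python
from math import comb

two_option_slots = 12            # assumed: every non-ring, non-trinket slot
trinket_pairs = comb(3, 2)       # 3 ways to pick 2 trinkets from 3
ring_pairs = comb(3, 2)          # 3 ways to pick 2 rings from 3

gear_only = 2 ** two_option_slots * trinket_pairs * ring_pairs
with_gems_and_enchants = gear_only * 2 * 2  # assumed: one socket, one enchant choice

print(f"{gear_only:,} gear combinations")
print(f"{with_gems_and_enchants:,} with gems and enchants")
```

That works out to 36,864 gear combinations and 147,456 once the two binary choices are added. Every extra socket or enchant slot doubles the total again.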

Most players have more options than that. A lot more.

150,000 combinations is not something a Monte Carlo simulator is going to chew through. If you want a result in five minutes, you'll need to narrow that down to a few hundred — maybe a thousand if you throw enough hardware (and money) at it. Which raises an awkward question: how are you supposed to do the narrowing?

OK user, narrow it down for me! I'll happily crank out an answer in five minutes — just start by telling me most of the answer.

That's never seemed like a useful trade to us. Instead we do it the other way around. Our combination optimizer prunes the search space automatically — you don't have to pre-pick which items, gems, and enchants to test before you start — and then we simulate the pruned-down set of likely winners and pick the best one from there. The minimum number of sets you actually have to score to get a reliable answer is in the hundreds of thousands, and it has to happen fast enough that nobody's waiting.

That's the obsession. Speed is what lets us actually optimize, instead of asking you to.

Old-school theorycraft, current-decade engine

We've been doing this for a long time. AskMrRobot launched in 2010, and between us we've shipped single-spec simulators, curated stat weights, classic Monte Carlo simulators, machine-learned polynomials fit to simulation output, deterministic mathematical models, and now this thing. We have tried it all. We're independent. We're not chasing anyone else's methodology. We do the math we think is right, and when we think we can improve, we change the math.

The new simulator is the foundation for features we couldn't ship before. Those are coming over the next several months. In the meantime: this is what runs underneath every recommendation the site gives you. Two people, fifteen years, and an excessive interest in getting the answer right.

We will always do the math, so you don't have to.

— Swol and Yellowfive
