Nutmeg — World Cup predictor & can a model beat the market?

The scoreboard · model vs reality

How is it actually doing?

Every 2026 World Cup game with the model's pre-match call next to the real result. No hindsight: each prediction used only the games played before it.

Why it never predicts a draw: even for a perfectly even match the model puts the draw at about 28%, versus about 36% for each side, so one of the two teams is almost always the single most likely outcome. The model gives draws a fair chance; it just rarely makes one its top pick, which is why drawn games show up as misses here. (It's also why accuracy on knockout games, which have no draws, runs higher at 69%.)
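To make the 28% / 36% split concrete, here is a minimal sketch of a Davidson-style three-way probability. The function name, the 400-point rating scale, and the draw parameter `nu = 7/9` are assumptions chosen so that an even match reproduces the figures quoted above; they are not taken from the model's actual code.

```python
import math

def davidson_probs(rating_diff: float, nu: float = 7 / 9, scale: float = 400.0):
    """Return (p_team_a, p_draw, p_team_b) under a Davidson model.

    rating_diff: team A's rating minus team B's, in Elo-style points.
    nu: draw parameter (hypothetical value, picked to give a 28% draw
        when the teams are level).
    scale: rating scale (assumed to be the usual 400 points).
    """
    # Strength ratio pi_a / pi_b implied by the rating gap.
    ratio = 10 ** (rating_diff / scale)
    a = math.sqrt(ratio)  # pi_a / sqrt(pi_a * pi_b)
    b = 1.0 / a           # pi_b / sqrt(pi_a * pi_b)
    denom = a + b + nu
    return a / denom, nu / denom, b / denom

even = davidson_probs(0.0)
print([round(p, 2) for p in even])  # [0.36, 0.28, 0.36]
```

With any `nu` below 1, the draw probability in this form stays below the stronger side's win probability, so the top pick is always a team.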

Accuracy · is it real?

How accurate is it, really?

The model's top pick is right about 60% of the time at this World Cup, and 69% across historical knockout games. Both clear chance comfortably — a three-way game is 33% to guess, a knockout is a coin-flip.

And it isn't luck: the accuracy tracks the model's own confidence. The surer it is, the more often it's right — which is exactly what a well-calibrated forecaster looks like.

Historical knockout games · accuracy by how confident the model was

Coin-flip baseline: 50%. The model beats it at every confidence level, and pulls further ahead the more sure it is.
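A check like this can be sketched by grouping predictions into confidence bands and computing the hit rate in each. The band edges and the sample data below are illustrative assumptions, not the model's actual results.

```python
from collections import defaultdict

def accuracy_by_confidence(predictions, edges=(0.5, 0.6, 0.7, 0.8, 1.0)):
    """Hit rate per confidence band.

    predictions: iterable of (confidence, was_correct) pairs, where
    confidence is the probability the model gave its top pick.
    Bands are [lo, hi), except the last one, which includes its upper edge.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    bands = list(zip(edges, edges[1:]))
    for conf, correct in predictions:
        for lo, hi in bands:
            is_last = (lo, hi) == bands[-1]
            if lo <= conf < hi or (is_last and conf == hi):
                totals[(lo, hi)] += 1
                hits[(lo, hi)] += int(correct)
                break
    return {band: hits[band] / totals[band] for band in totals}

# Made-up sample: six knockout picks with their confidence and outcome.
sample = [(0.55, True), (0.55, False), (0.65, True),
          (0.65, True), (0.65, False), (0.90, True)]
for band, acc in sorted(accuracy_by_confidence(sample).items()):
    print(band, round(acc, 2))
```

If the hit rate rises from band to band and stays above 50% in each, the picture matches the claim in the caption.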

The experiment

The only test I trust

Picking winners is the easy part — the betting price already knows who's likely to win. The real question is whether a model can be sharper than the price.

So the test isn't whether it picks winners. It's whether it can beat the closing line, the last price before kickoff. By then the price has absorbed every public model and all the sharp money, which makes it about as accurate as a price gets. Beat that, or there's no edge to find.

I wrote the pass/fail rule down before running anything, so I couldn't move the goalposts: the model passes only if, on games it never trained on, it systematically gets ahead of the closing price. The measure is closing line value (CLV): whether the price drifts toward the model's calls between the bet and the close. A strict test, and a fair one.
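As a rough illustration of how CLV can be computed, the sketch below compares the price taken with the closing price for the same outcome. The function names are hypothetical, and the exact pass threshold used in the experiment is not stated in the text, so none is encoded here.

```python
def clv_decimal(odds_taken: float, odds_close: float) -> float:
    """CLV for a bet at decimal odds: positive when the odds shortened
    after the bet, i.e. the market moved toward the pick."""
    return odds_taken / odds_close - 1.0

def clv_probability(price_taken: float, price_close: float) -> float:
    """CLV for a prediction-market contract priced between 0 and 1:
    positive when the contract closed higher than the price paid."""
    return price_close - price_taken

def mean_clv(bets) -> float:
    """Average CLV over (odds_taken, odds_close) pairs."""
    values = [clv_decimal(taken, close) for taken, close in bets]
    return sum(values) / len(values)

# Illustrative numbers only: one bet that beat the close, one that did not.
print(round(clv_decimal(2.10, 2.00), 3))   # 0.05
print(round(clv_decimal(2.00, 2.10), 3))   # -0.048
print(round(clv_probability(0.45, 0.48), 2))  # 0.03
```

A model with a real edge would show a mean CLV that stays positive across many out-of-sample bets.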

What we actually did

Built the simplest model — Elo

One rating per team, learned only from past results, dates, and venue. No players, no injuries. A Davidson draw model turns rating gaps into home/draw/away odds. It only sees games in date order, so it can't peek at the future.
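The "date order, no peeking" idea can be sketched as a walk-forward loop: record the prediction first, then update the ratings with the result. This uses the standard logistic Elo expectation; the K-factor, home advantage, and starting rating are assumed values, not the ones the model uses.

```python
from collections import defaultdict

def walk_forward_elo(matches, k=20.0, home_adv=60.0, start=1500.0):
    """Run Elo over matches in date order.

    matches: iterable of (date, home, away, home_goals, away_goals, neutral),
    with ISO date strings so that sorting gives chronological order.
    Returns (predictions, ratings); each prediction is made before the
    match result is used to update anything.
    """
    ratings = defaultdict(lambda: start)
    predictions = []
    for date, home, away, hg, ag, neutral in sorted(matches, key=lambda m: m[0]):
        # The rating gap, plus home advantage unless the venue is neutral.
        diff = ratings[home] - ratings[away] + (0.0 if neutral else home_adv)
        expected = 1.0 / (1.0 + 10 ** (-diff / 400.0))
        predictions.append((date, home, away, expected))

        # Only now does the result touch the ratings.
        actual = 1.0 if hg > ag else 0.5 if hg == ag else 0.0
        delta = k * (actual - expected)
        ratings[home] += delta
        ratings[away] -= delta
    return predictions, dict(ratings)

# Two made-up neutral-venue games, deliberately listed out of order.
matches = [
    ("2024-06-20", "A", "B", 0, 0, True),
    ("2024-06-10", "A", "B", 1, 0, True),
]
preds, ratings = walk_forward_elo(matches)
for p in preds:
    print(p[0], p[1], p[2], round(p[3], 3))
```

The first prediction is an even 0.5 because neither team has a history yet; the second already reflects team A's earlier win.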

The club benchmark

Walk-forward over 24,359 matches in the five big European leagues, betting value picks at Pinnacle's pre-close price and grading against its close.
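One common way to define a "value pick" is an outcome whose expected return at the offered price is positive according to the model. The sketch below assumes that definition and a hypothetical `min_edge` threshold; the text does not spell out the exact selection rule used.

```python
def value_picks(model_probs, decimal_odds, min_edge=0.0):
    """Return (outcome, expected_return) pairs where the model sees value.

    model_probs: dict of outcome -> model probability.
    decimal_odds: dict of outcome -> decimal odds on offer.
    min_edge: minimum expected return per unit staked (assumed threshold).
    """
    picks = []
    for outcome, p in model_probs.items():
        expected_return = p * decimal_odds[outcome] - 1.0
        if expected_return > min_edge:
            picks.append((outcome, expected_return))
    # Largest edge first.
    return sorted(picks, key=lambda pick: -pick[1])

# Illustrative numbers only.
probs = {"home": 0.50, "draw": 0.26, "away": 0.24}
odds = {"home": 2.10, "draw": 3.40, "away": 3.60}
for outcome, edge in value_picks(probs, odds, min_edge=0.02):
    print(outcome, round(edge, 3))  # home 0.05
```

Each pick is then graded against the closing price for the same outcome rather than against the match result alone.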

Pivoted to the World Cup

Rebuilt for national teams with neutral-venue handling. Tested the hardest case — the knockout stage (binary, no draws) — training on everything else.

Checked it against real money

Pulled live 2026 World Cup prices from Kalshi & Polymarket and measured CLV from Kalshi's own minute-by-minute price history.

Stress-tested every disagreement

A multi-agent research pass took the 12 games where the model disagreed most with the market and asked, with real team news: is the market wrong, or does it know something our score-only model doesn't?

The data

Clubs: football-data.co.uk, with results and Pinnacle closing odds for 5 leagues over ~14 seasons.

Internationals: martj42/international_results, covering every national-team game since 1872 (~49k), plus 2026 fixtures and penalty shootouts.

Live markets: Kalshi per-match moneylines and price history, and Polymarket.

The results

The market is just a hair sharper.

The model is a genuinely strong forecaster. Put head-to-head with the betting market, though, the market edges it — and the market's own prices turn out to be almost perfectly accurate.

[Chart: The model vs the betting market. A whisker behind.]

[Chart: How often the model & the market agree. Agreement, not accuracy.]

[Chart: Is the market really that good? Calibration over 73,077 outcomes.]
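A calibration check of this kind can be sketched as follows: convert each match's three odds into probabilities, bucket them, and compare the average stated probability with how often those outcomes happened. Each match contributes three outcomes, which is consistent with 73,077 = 3 × 24,359. Removing the bookmaker margin by simple proportional normalisation is an assumption here, as are the function names.

```python
def implied_probs(odds):
    """Turn decimal odds for one match into probabilities that sum to 1
    (proportional margin removal, an assumed method)."""
    raw = [1.0 / o for o in odds]
    total = sum(raw)
    return [r / total for r in raw]

def calibration_table(rows, n_bins=10):
    """rows: iterable of (odds_triplet, index_of_actual_outcome).

    Returns a list of (mean_predicted, observed_frequency, count),
    one entry per non-empty probability bin, lowest bin first.
    """
    bins = [[0, 0.0, 0] for _ in range(n_bins)]  # count, sum of probs, hits
    for odds, result in rows:
        for i, p in enumerate(implied_probs(odds)):
            b = min(int(p * n_bins), n_bins - 1)
            bins[b][0] += 1
            bins[b][1] += p
            bins[b][2] += int(i == result)
    return [(s / n, h / n, n) for n, s, h in bins if n]

# Two made-up matches at identical odds: home win, then draw.
rows = [([2.0, 4.0, 4.0], 0), ([2.0, 4.0, 4.0], 1)]
for predicted, observed, count in calibration_table(rows):
    print(round(predicted, 2), round(observed, 2), count)
```

A well-calibrated price shows the predicted and observed columns tracking each other closely in every bin.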

The verification · what the market knew

Where the model disagreed most, the market was right.

I researched the team news and group situation for each of those games. Every apparent edge came down to my model being wrong: a rating quirk, or something the market knew that the model couldn't.

[Table columns: Match · Model's bet · Edge · What the market knew that we didn't]

The takeaway

A betting price is a crowd of experts, distilled into one number.

That's the real lesson. A simple model can forecast soccer well, but the market price already folds in every model and every expert, and it's remarkably hard to out-think.

So take this for what it is: a fast, rigorous read on who's likely to win, and a fun way to play out the rest of the World Cup. Pick two teams up top and see for yourself.
