Stay inside your budget by going outside your own boardroom.

If you read the Planning Fallacy piece, you’ll know where this is going.

Leaders consistently underestimate time, cost and risk. They overestimate the value of their own prior experience. And critically — when presented with evidence that their estimates are wrong — they tend to ignore it.

The Planning Fallacy is a cognitive flaw we are all susceptible to. Which means the antidote isn’t better intentions. It’s a better method.

Reference Class Forecasting is that method. And even I forget to use it properly.

What it actually is.

Kahneman and Tversky, whose work sits underneath most of what I write about in this series, described the procedure now known as Reference Class Forecasting as a deliberate way to eliminate two specific biases:

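  • “Non-regressiveness” in predictions: our estimates sit too far from the typical outcome because we don’t pull them back toward the average for comparable cases

  • Overconfidence in precision: we set our ranges too narrow because we get too attached to our own “special” knowledge of the situation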

The fix is deceptively simple: instead of forecasting from the inside out — using your own experience, your own team’s assumptions, your own misguided optimism about this particular project — you forecast from the outside in. You find a relevant reference class of comparable projects or experiences, look at what actually happened across that distribution, and use it to calibrate your own prediction.
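As a rough sketch of that calibration step, the snippet below pulls an inside-view estimate part of the way back toward the reference-class average. The function name, the `predictability` weight and every number are my own illustrative assumptions, not figures from Kahneman and Tversky or from Flyvbjerg's datasets. The weight stands in for how well this kind of estimate has tracked real outcomes in the past: 0 means not at all, 1 means perfectly.

```python
def regress_toward_class(inside_estimate, class_average, predictability):
    """Shrink an inside-view estimate toward the reference-class average.

    predictability is an assumed weight between 0 and 1:
    0 ignores the inside view, 1 ignores the reference class.
    """
    if not 0.0 <= predictability <= 1.0:
        raise ValueError("predictability must be between 0 and 1")
    return class_average + predictability * (inside_estimate - class_average)


# Hypothetical numbers: the team believes 10 months, comparable
# programmes averaged 18 months, and past estimates were only loosely
# related to outcomes (assumed weight of 0.25).
calibrated = regress_toward_class(10.0, 18.0, 0.25)
print(f"Calibrated estimate: {calibrated:.1f} months")
```

With these made-up inputs the calibrated figure lands at 16 months, much closer to what comparable programmes actually took than to what the team hoped for. The less predictive your past estimates have been, the harder the pull toward the class average.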

Here’s a quick exercise to make this concrete.

Think about the last book you read. How many copies do you think it sold?

Most people guess somewhere between tens of thousands and hundreds of thousands.

The reality: selling just 5,000 copies is considered a strong result for a debut author. And that’s before you add the variables that refine the reference class: fiction or non-fiction? Hardback or paperback? First release or backlist?

Each variable narrows the distribution and gives you a far more calibrated starting point than your gut instinct ever could.

That’s Reference Class Forecasting. Distributional information, used deliberately, as a check on inside-view thinking.
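To show what narrowing a reference class might look like in practice, here is a minimal sketch using the book exercise. The records, field names and helper functions are all hypothetical, invented purely for illustration, so treat the output as a demonstration of the mechanics rather than a statement about real book sales.

```python
from statistics import median

# Hypothetical sales records. Every figure here is made up for
# illustration; swap in real benchmark data before drawing conclusions.
books = [
    {"genre": "fiction", "format": "paperback", "debut": True, "copies": 1200},
    {"genre": "fiction", "format": "hardback", "debut": True, "copies": 3000},
    {"genre": "fiction", "format": "paperback", "debut": False, "copies": 40000},
    {"genre": "non-fiction", "format": "hardback", "debut": True, "copies": 800},
    {"genre": "non-fiction", "format": "paperback", "debut": True, "copies": 5000},
    {"genre": "non-fiction", "format": "paperback", "debut": False, "copies": 15000},
    {"genre": "fiction", "format": "paperback", "debut": True, "copies": 2500},
]


def narrow(records, **criteria):
    """Keep only the records that match every given attribute."""
    return [r for r in records if all(r[k] == v for k, v in criteria.items())]


def summarise(records):
    """Return (sample size, median copies sold) for a reference class."""
    return len(records), median(r["copies"] for r in records)


everything = summarise(books)
debut_only = summarise(narrow(books, debut=True))
debut_fiction_pb = summarise(
    narrow(books, debut=True, genre="fiction", format="paperback")
)

for label, (n, mid) in [
    ("All books", everything),
    ("Debut only", debut_only),
    ("Debut fiction paperback", debut_fiction_pb),
]:
    print(f"{label}: n={n}, median copies={mid}")
```

Notice the trade-off the sketch exposes: each extra variable makes the class more relevant but leaves fewer records to learn from. With this toy data the tightest class has only two books in it, which is too thin to lean on.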

Now apply it to your next programme estimate. Suddenly the question isn’t “how long do we think this will take?” It’s “what does the distribution of comparable programmes tell us about how long this actually takes?”

Those are very different questions — and they tend to produce very different answers.
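Here is one way the second question could be sketched in code, assuming you have managed to collect the ratio of actual to planned duration for a handful of comparable programmes. The ratios, the 12-month inside-view estimate and the helper name are all hypothetical, and the simple nearest-rank percentile is my own shortcut, not Flyvbjerg's full method.

```python
from statistics import median

# Hypothetical reference class: actual duration divided by planned
# duration for ten comparable programmes. Illustrative values only.
overrun_ratios = [0.95, 1.0, 1.1, 1.2, 1.25, 1.4, 1.5, 1.8, 2.1, 2.6]


def ratio_at_certainty(ratios, certainty_pct):
    """Nearest-rank percentile: the ratio that certainty_pct percent of
    past programmes came in at or under. certainty_pct is a whole number."""
    if not ratios:
        raise ValueError("reference class is empty")
    if not 0 < certainty_pct <= 100:
        raise ValueError("certainty_pct must be between 1 and 100")
    ordered = sorted(ratios)
    rank = -(-certainty_pct * len(ordered) // 100)  # ceiling division
    return ordered[rank - 1]


inside_view_months = 12  # hypothetical inside-view estimate

typical = inside_view_months * median(overrun_ratios)
p80 = inside_view_months * ratio_at_certainty(overrun_ratios, 80)

print(f"Inside view: {inside_view_months} months")
print(f"Typical comparable outcome: {typical:.1f} months")
print(f"80% of comparables finished within: {p80:.1f} months")
```

On this invented data, a 12-month plan becomes roughly 16 months at the median and closer to 22 months if you want to be right four times out of five. The point is not the specific numbers but that the estimate now comes with a range drawn from what actually happened elsewhere.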

Speaking of questions... are you working through these kinds of thoughts in a current programme or an upcoming complex change?

Pragma is the AI change advisor I’ve built for exactly this kind of outside-in conversation. Free to try, no sign-up. app.pragmaticchange.com.au

Why Flyvbjerg is the person to listen to on this.

Bent Flyvbjerg has spent his career doing what most organisations never do: systematically collecting data on what actually happens in large projects, across sectors, at scale.

His findings are sobering. Across a dataset of 16,000 projects, cost overruns and schedule slippage are not the exception — they are the norm. The Sydney Opera House isn’t a cautionary tale about one famously bungled project. It’s a near-perfect illustration of what happens when inside-view planning goes unchecked.

What makes Flyvbjerg’s work so useful for practitioners is that he isn’t just diagnosing the problem. He’s building the reference class. His datasets are the outside view. They are the distributional information that most organisations are currently ignoring because they’re too busy building their own bespoke, optimistic, inside-view project plans.

The Goldilocks data problem.

Here’s the honest version of what implementing Reference Class Forecasting actually requires — and why most organisations fall short.

You need three things:

1. Genuine executive commitment to outside-in thinking. Not novel advice — but worth stating plainly. If your senior leaders are emotionally invested in their inside-view estimates, no methodology will save you. The Dunning-Kruger effect and the Planning Fallacy are close cousins here: the most confident voice in the room is often the one with the narrowest reference class.

2. A strong facilitator. Introducing reference class thinking in a room full of senior leaders who’ve already committed to a number is a delicate exercise. It requires someone who can introduce the concept without threatening egos, surface good distributional data, call out bias respectfully, and steer the process across what is often multiple sessions.

This is also usually not a one-off workshop. Self-serving to say as a facilitator who runs these sessions — but true.

3. Goldilocks data — and this is the hard part.

Not too hot. Not too cold. Just right.

Too hot: data that’s been sanitised by internal politics. Every organisation has project post-mortems that tell a more flattering story than the actual experience warranted. Budget overruns get reclassified. Scope changes get used to explain away delivery failures. Data corrupted by the accountability sink of a complex organisation is worse than no data at all — it confirms the inside view rather than challenging it.

The same dynamic I’ve written about in the context of AI adoption applies here: organisational silos don’t just fragment data, they quietly corrupt it.

Too cold: data that’s stale, incomplete, or from contexts so different from yours that the reference class doesn’t hold. A dataset of infrastructure megaprojects is interesting context for a large ERP implementation — but it’s not a direct comparable.

Just right: recent, relevant, honestly reported data from organisations and projects that genuinely resemble yours. This is surprisingly hard to come by.

Large organisations often have a lot of data — but it’s fragmented across silos and politically compromised at the edges. Small organisations often don’t have enough of it, or lack the measurement consistency to make it useful. For both, the honest answer is usually that you need some external data — industry benchmarks, sector statistics, published research — combined with whatever Goldilocks internal data you can trust.
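The three temperature checks can be written down as a simple screen. This is a sketch under loud assumptions: the record fields, the five-year age limit, and the use of independent review as a stand-in for "honestly reported" are all my own hypothetical choices, and the sample records are invented. Your own criteria will differ.

```python
from dataclasses import dataclass


@dataclass
class ProjectRecord:
    name: str
    year_completed: int
    sector: str
    independently_reviewed: bool  # assumed proxy for "honestly reported"
    cost_overrun_pct: float


def screen(record, current_year, target_sector, max_age_years=5):
    """Label a record 'too hot', 'too cold' or 'just right'."""
    if not record.independently_reviewed:
        return "too hot"   # outcome may have been sanitised internally
    if current_year - record.year_completed > max_age_years:
        return "too cold"  # stale
    if record.sector != target_sector:
        return "too cold"  # context too different to compare directly
    return "just right"


# Hypothetical records, invented purely to exercise the screen.
records = [
    ProjectRecord("ERP rollout A", 2023, "ERP", True, 45.0),
    ProjectRecord("ERP rollout B", 2022, "ERP", False, 5.0),
    ProjectRecord("Rail upgrade", 2021, "infrastructure", True, 60.0),
    ProjectRecord("ERP rollout C", 2012, "ERP", True, 30.0),
]

labels = {
    r.name: screen(r, current_year=2025, target_sector="ERP") for r in records
}
usable = [r for r in records if labels[r.name] == "just right"]

for name, label in labels.items():
    print(f"{name}: {label}")
print(f"Usable reference class size: {len(usable)}")
```

In this toy run only one of four records survives, which is fairly typical of the experience: once you screen honestly, the internal data thins out fast and the case for topping it up with external benchmarks makes itself.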

A shout-out here to my Lean Six Sigma colleagues, who go deep on Measurement Systems Analysis for exactly this reason. Knowing whether your measurement system is actually measuring what you think it’s measuring is the unglamorous prerequisite to any serious use of data in planning. It’s not the headline act, but without it the headline act falls apart.

What this looks like in practice.

I want to be honest about something: Reference Class Forecasting done properly is a significant investment. The full Flyvbjerg methodology — with rigorous data collection, proper regression analysis, and structured facilitation across an executive team — is not something you knock out in an afternoon.

But a lighter version of the thinking is accessible to almost any leadership team, right now, with what they have. As usual, even the act of considering this approach forces better thinking.

The practical question to introduce in your next planning session: “What does the outside world tell us about projects like this one?”

Ask your team to look beyond their own experience. Find three comparable programmes — ideally from outside your organisation, ideally with honest outcome data. Ask what they cost, how long they took, and what went wrong. Use that as your calibration before you lock in your own estimates.

It won’t give you Flyvbjerg’s statistical rigour. But it will surface assumptions your inside-view planning has been quietly ignoring — and that alone is worth the hour.

The Planning Fallacy is the diagnosis. Reference Class Forecasting is the discipline. You need both — and the second one only works if you’re willing to look somewhere other than your own boardroom for the answer.

Bent Flyvbjerg’s work on megaproject planning and reference class forecasting is collected in How Big Things Get Done (written with Dan Gardner) and his associated HBR research. The original Kahneman and Tversky framing appears in their 1979 paper “Intuitive Prediction: Biases and Corrective Procedures”. Both are worth your time.