Dr. Bayes, or How I Learned to Stop Worrying and Love Updating

Over the past week, I spent a chunk of every morning cycling through the desert around Tucson, Arizona. On Friday, while riding toward my in-laws’ place in the mountains west of town, I heard the roar of a jet overhead. My younger son’s really into flying right now, and Tucson’s home to a bunch of fighter jets, so I reflexively glanced up toward the noise, hoping to spot something that would interest him.

On that first glance, all I saw was an empty patch of deep blue sky. Without effort, my brain immediately retrieved a lesson from middle-school physics, reminding me that the relative speeds of light and sound meant any fast-moving plane would appear ahead of its roar. But which way? Before I glanced up again, I drew on prior knowledge of local patterns to guess that it would almost certainly be to my left, traveling east, and not so far ahead of the sound because it would be flying low as it approached either the airport or the Air Force Base.  Moments after my initial glance, I looked up a second time and immediately spotted the plane where I’d now expected to find it. When I did, I wasn’t surprised to see that it was a commercial jet, not a military one, because most of the air traffic in the area is civilian.

This is Bayesian thinking, and it turns out that we do it all the time. The essence of Bayesian inference is updating. We humans intuitively form and hold beliefs (estimates) about all kinds of things. Those beliefs are often erroneous, but we can make them better by revising (updating) them whenever we encounter new information that bears on them. Updating is really just a form of learning, but Bayes’ theorem gives us a way to structure that learning that turns out to be very powerful. As cognitive scientists Tom Griffiths and Joshua Tenenbaum summarize in a nice 2006 paper [PDF] called “Statistics and the Bayesian Mind”:

The mathematics of Bayesian belief is set out in the box. The degree to which one should believe in a particular hypothesis h after seeing data d is determined by two factors: the degree to which one believed in it before seeing d, as reflected by the prior probability P(h), and how well it predicts the data d, as reflected in the likelihood, P(d|h).
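
The “box” they refer to isn’t reproduced here, but the rule itself is just the standard statement of Bayes’ theorem, which combines those two factors and normalizes over all the hypotheses under consideration:

```latex
P(h \mid d) = \frac{P(d \mid h)\,P(h)}{P(d)}
            = \frac{P(d \mid h)\,P(h)}{\sum_{h'} P(d \mid h')\,P(h')}
```

Here P(h | d) is the posterior, the updated degree of belief in hypothesis h after seeing data d; P(h) is the prior; P(d | h) is the likelihood; and the denominator P(d) simply ensures that the posterior probabilities sum to one.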

This might sound like a lot of work, or just too arcane to bother with, but Griffiths and Tenenbaum argue that we often think this way intuitively. Their paper gives several examples, including predictions about the next result in a series of coin flips and the common tendency to infer causality from clusters that actually arise at random.

The same process appears in my airplane-spotting story. My initial glance is akin to the base rates that are often used as the starting point for Bayesian inference: to see something you hear, look where the sound is coming from. When that prediction failed, I went through three rounds of updating before I looked up again: one based on general knowledge about the relative speeds of light and sound, and then a second (direction) and third (commercial vs. military) based on prior observations of local air traffic. My final “prediction” turned out to be right because those local patterns are strong, but even with all that objective information, there was still some uncertainty. Who knows, there could have been an emergency, or a rogue pilot, or an alien invasion…
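
To make that last round of updating concrete, here is a minimal sketch of the “commercial vs. military” calculation in Python. The numbers are purely hypothetical illustrations of the kind of base rate and likelihoods involved, not figures from the story:

```python
# Minimal Bayesian update for the "commercial vs. military" question.
# All probabilities are hypothetical, chosen only to show the mechanics.

prior = {"commercial": 0.8, "military": 0.2}        # illustrative local base rate
likelihood = {"commercial": 0.5, "military": 0.7}   # P(loud, low roar | hypothesis), invented

# Bayes' rule: posterior is proportional to prior times likelihood, then normalize.
unnormalized = {h: prior[h] * likelihood[h] for h in prior}
evidence = sum(unnormalized.values())
posterior = {h: round(p / evidence, 2) for h, p in unnormalized.items()}

print(posterior)  # {'commercial': 0.74, 'military': 0.26}
```

Even with evidence that leans a bit toward the military hypothesis, the strong civilian base rate keeps “commercial” the better bet, which is roughly how the story played out.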

I’m writing about this because I think it’s interesting, but I also have ulterior motives. A big part of my professional life involves using statistical models to forecast rare political events, and I am deeply frustrated by frequent encounters with people who dismiss statistical forecasts out of hand (see here and here for previous posts on the subject). It’s probably unrealistic of me to think so, but I am hopeful that recognition of the intuitive nature and power of Bayesian updating might make it easier for skeptics to make use of my statistical forecasts and others like them.

I’m a firm believer in the forecasting power of statistical models, so I usually treat a statistical forecast as my initial belief (or prior, in Bayesian jargon) and then only revise that forecast as new information arrives. That strategy is based on another prior, namely, the body of evidence amassed by Phil Tetlock and others that the predictive judgments of individual experts often aren’t very reliable, and that statistical models usually produce more accurate forecasts.

From personal experience I gather that most people, including many analysts and policymakers, don’t share that belief about the power of statistical models for forecasting. Even so, I would like to think those skeptics might still see how Bayes’ rule would allow them to make judicious use of statistical forecasts, even if they trust their own or other experts’ judgments more. After all, to ignore a statistical forecast is equivalent to holding the extreme view that the forecast carries absolutely no useful information. In The Theory That Would Not Die, an entertaining lay history of Bayes’ rule, Sharon Bertsch McGrayne quotes Larry Stone, a statistician who used the theorem to help find a nuclear submarine that went missing in 1968, as saying that “Discarding one of the pieces of information is in effect making the subjective judgment that its weight is zero and the other weight is one.”

So instead of rejecting the statistical forecast out of hand, why not update in response to it? When the statistical forecast closely accords with your prior belief, it will strengthen your confidence in that judgment, and rightly so. When the statistical forecast diverges from your prior belief, Bayes’ theorem offers a structured but simple way to arrive at a new estimate. Experience shows that this deliberate updating will produce more accurate forecasts than the willful myopia involved in ignoring the new information the statistical model has provided. And, as a kind of bonus, the deliberation involved in estimating the conditional probabilities Bayes’ theorem requires may help to clarify your thinking about the underlying processes involved and the sensitivity of your forecasts to certain assumptions.
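
To sketch what that structured updating might look like in practice, here is a small, hypothetical example in Python. It treats an analyst’s subjective probability of some rare event as the prior and a statistical model’s “high risk” flag as new evidence; the model’s assumed hit rate and false-alarm rate are invented for illustration, not taken from any real model:

```python
# Updating an expert's prior judgment with a statistical forecast, treated as
# evidence with an assumed (hypothetical) track record.

def update(prior_event, p_flag_given_event, p_flag_given_no_event):
    """Return P(event | model flagged 'high risk') via Bayes' rule."""
    numerator = p_flag_given_event * prior_event
    denominator = numerator + p_flag_given_no_event * (1.0 - prior_event)
    return numerator / denominator

# Hypothetical numbers: the analyst's prior puts the chance of the event at 10%,
# and the model flags 'high risk'. Suppose the model flags 70% of cases where the
# event later happens but also 20% of cases where it doesn't (both invented).
posterior = update(prior_event=0.10,
                   p_flag_given_event=0.70,
                   p_flag_given_no_event=0.20)

print(round(posterior, 2))  # 0.28: higher than the prior, but far from certain
```

The particular numbers don’t matter; the point is that the forecast moves the estimate instead of being either swallowed whole or assigned a weight of zero.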

PS. For some nice worked examples of Bayesian updating, see Appendix B of The Theory That Would Not Die or Chapter 8 of Nate Silver’s book, The Signal and the Noise. And thanks to Paul Meinshausen for pointing out the paper by Griffiths and Tenenbaum, and to Jay Yonamine for recommending The Theory That Would Not Die.
