Forecasters of U.S. presidential elections are carrying on a healthy debate about the power and value of the models they construct. Nate Silver fired the opening salvo with a post arguing that the forecasts aren’t nearly as good as political scientists (and their publishers) claim. John Sides and Lynn Vavreck responded with reasoned defenses, and Brendan Nyhan’s earlier post on the topic deserves another look in response to Silver’s skepticism as well.
One reason it’s so hard to forecast U.S. presidential elections is that there aren’t that many examples from which to learn. American presidential elections only happen 25 times each century, and the country’s only been around for a couple of those. As if that weren’t enough trouble, it’s hard to imagine that the forces shaping the outcomes of those contests aren’t changing over time. Just 25 election cycles ago, TVs and PCs didn’t exist, and most American homes didn’t even have phones.
Those of us who try to forecast rare forms of political conflict and crisis confront a similar challenge. Right now, I’m working on a model that’s meant to help anticipate onsets of state-sponsored mass killing in countries around the world. Since World War II, there have been only 110 of these “events” worldwide, and they have become even rarer in the two decades since the collapse of the Soviet Union.
The rarity of these atrocious episodes is good news for humanity, of course, but it does make statistical forecasting more difficult. With so few events, statistical models don’t have many cases on which to train, and modelers have to think more carefully about the trade-offs involved in partitioning the data for the kind of out-of-sample cross-validation that offers the most information about a model’s accuracy. The same logic applies to wars within and between states, coups, democratic transitions, popular uprisings, and just about everything else I’ve ever been asked to try to forecast.
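To make that trade-off concrete, here’s a minimal sketch of the kind of cross-validation I have in mind, written in Python with scikit-learn on simulated data. The covariates, event rate, and model below are invented for illustration, not the actual model behind these forecasts; the point is just that stratifying the folds keeps a handful of events in every test partition.

```python
# A minimal sketch of out-of-sample cross-validation for a rare-events
# model. All data here are simulated; the covariates and model are
# illustrative assumptions, not the model discussed in this post.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
n = 5000                          # country-years
X = rng.normal(size=(n, 4))       # stand-in covariates
# Simulate a rare outcome: roughly 1-2% of cases are "onsets."
p = 1 / (1 + np.exp(-(X[:, 0] + 0.5 * X[:, 1] - 4.5)))
y = rng.binomial(1, p)

# Stratified folds preserve the tiny event rate in each partition, so no
# test fold ends up with zero events to validate against.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
aucs = []
for train_idx, test_idx in cv.split(X, y):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    scores = model.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], scores))

print(f"out-of-sample AUC: {np.mean(aucs):.2f} (+/- {np.std(aucs):.2f})")
```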
When modeling events as rare as these in a data set that covers all relevant cases, the utility of the forecasts isn’t in the point estimate of the likelihood that the event will occur. With small samples and noisy data sets, those point estimates are way too uncertain to take literally, and even the most powerful models will never generate predictions that are nearly as precise as we’d like.
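A quick back-of-the-envelope sketch shows why. With only a handful of events in a modest pool of cases, even a simple bootstrap interval around the plain base rate (never mind a full model’s conditional estimates) is strikingly wide relative to the estimate itself. The counts below are invented to illustrate the arithmetic.

```python
# A minimal sketch of how noisy rare-event estimates are: bootstrap a
# base rate from a small pool with only a handful of events. The counts
# are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n_cases, n_events = 160, 4        # e.g., 160 countries, 4 onsets
outcomes = np.array([1] * n_events + [0] * (n_cases - n_events))

boot = [rng.choice(outcomes, size=n_cases, replace=True).mean()
        for _ in range(10_000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"point estimate {n_events / n_cases:.3f}, "
      f"95% bootstrap interval [{lo:.3f}, {hi:.3f}]")
```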
Instead, a good starting point for forecasting from rare-events models is a list of all at-risk cases shown in descending order by estimated probability of event occurrence. Most of the countries at the tops and bottoms of these lists will strike their consumers as “no-brainers.” For example, most of us probably don’t need a statistical model to tell us that China is especially susceptible to the onset of civil-resistance campaigns because it’s an authoritarian regime with more than 1 billion citizens. Likewise for a list that tells us civil war is unlikely to break out in Norway this year. Both of those forecasts can be accurate without being especially useful.
The real value of rare-events forecasts comes from the surprises: the cases for which a ranked list generated from a reasonably reliable model contradicts our prior expectations. These deviations give us a useful signal to revisit those expectations and, when relevant, to prepare for or even move to prevent the anticipated crisis.
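As a toy illustration of how those surprises might be surfaced, here’s a sketch that ranks cases by modeled probability and flags the ones whose model rank diverges sharply from an analyst’s prior rank. Every name, probability, and prior rank below is invented.

```python
# A minimal sketch of turning model output into a ranked watch list and
# flagging "surprises." The country names, probabilities, and prior
# ranks are all invented for illustration.
model_probs = {
    "Country A": 0.31, "Country B": 0.22, "Country C": 0.18,
    "Country D": 0.09, "Country E": 0.04, "Country F": 0.02,
}
# Analysts' prior sense of how at-risk each case is (1 = most at risk).
prior_ranks = {
    "Country A": 2, "Country B": 6, "Country C": 3,
    "Country D": 1, "Country E": 5, "Country F": 4,
}

# Sort all cases in descending order of estimated probability.
watch_list = sorted(model_probs, key=model_probs.get, reverse=True)

for model_rank, country in enumerate(watch_list, start=1):
    gap = prior_ranks[country] - model_rank
    flag = "  <- surprise" if abs(gap) >= 3 else ""
    print(f"{model_rank}. {country} (p={model_probs[country]:.2f}, "
          f"prior rank {prior_ranks[country]}){flag}")
```

In this toy output, “Country B” is the Mali-style surprise (the model ranks it far higher than the conventional wisdom does), and “Country D” is the reverse: a case analysts worry about that the model discounts.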
Take the recent coup in Mali. While the conventional narrative described this country as a consolidated democracy, a watch list generated from statistical models identified it as one of the countries in the world most likely to suffer a coup attempt in 2012. Had people concerned about Mali’s political stability seen that forecast ahead of time, it might have spurred them to rethink their assumptions and perhaps prepare better for this unfortunate turn of events.
These surprises can cut the other way, too. In January, when I used a model of democratic transitions to generate forecasts for 2012, I was chagrined to see that Egypt ranked pretty far down the list. Now, with the outcome of the transition increasingly in doubt, I’m thinking that forecast wasn’t so bad after all. For concerned observers, a forecast like that could have served as a useful reminder that Egypt still isn’t on a steady glide path to democracy.
Even with well-calibrated models, these “deviations” won’t always prove prescient. A watch list that accurately identified Egypt, Morocco, and Syria as three of the countries most likely to see civil-resistance campaigns emerge in 2011 also ranked North Korea in the top 10 for that year, and nothing in that list or the underlying model could have told us in advance which would be which.
In spite of that imprecision, I think the forecasts worked pretty well. Most of the countries toward the top of the list may not have seen popular uprisings, but nearly all of the uprisings that did occur happened in the top 30. Analysts who were surprised to see a civil-resistance campaign erupt in Syria might not have been so surprised if they had seen those forecasts and reconsidered their mental models accordingly.
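One simple way to score a watch list on exactly that criterion is “recall at k”: of the events that actually occurred, what share fell inside the top k slots of the list? A minimal sketch, with invented case names and outcomes:

```python
# A minimal sketch of scoring a ranked watch list by "recall at k."
# Case names and outcomes are invented for illustration.
def recall_at_k(ranked_cases, actual_events, k=30):
    """Share of actual events captured in the top k of the ranked list."""
    top_k = set(ranked_cases[:k])
    return sum(case in top_k for case in actual_events) / len(actual_events)

# ranked_cases: every case, sorted by predicted probability, highest first.
ranked_cases = [f"Country {i}" for i in range(1, 161)]
actual_events = ["Country 3", "Country 11", "Country 27", "Country 54"]

print(f"recall at 30: {recall_at_k(ranked_cases, actual_events):.2f}")
# -> 0.75: three of the four events landed in the top 30.
```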
The broader point is that, when trying to forecast rare events, we shouldn’t get too hung up on the exact values of the predicted probabilities. The model we’re striving for here isn’t an actuarial table that allows us to allocate our dollars and attention as efficiently as possible. Even if policy and advocacy worked that way (and they don’t), the statistics wouldn’t allow it.
A more useful model, I think, is the light on your car’s dashboard that tells you you’re running low on fuel. When that light comes on, you don’t know how far you can drive before you’ll run out of gas, but you do know that you’d better start worrying about refilling soon. The light directs your attention to a potential problem you probably weren’t thinking about a few moments earlier. A reasonably well-calibrated statistical model of rare political events should do the same thing for analysts and other concerned observers, whose attention usually doesn’t get redirected until the engine is already sputtering.