Last November, after the U.S. elections, I wrote a thing for Foreign Policy about persistent constraints on the accuracy of statistical forecasts of politics. The editors called it “Why the World Can’t Have a Nate Silver,” and the point was that much of what people who follow international affairs care about is still a lot harder to forecast accurately than American presidential elections.
One of the examples I cited in that piece was Silver’s poor performance on the U.K.’s 2010 parliamentary elections. Just two years before his forecasts became a conversation piece in American politics, the guy the Economist called “the finest soothsayer this side of Nostradamus” missed pretty badly in what is arguably another of the most information-rich election environments in the world.
A couple of recent election-forecasting efforts only reinforce the point that, the Internet and polling and “math” notwithstanding, this is still hard to do.
The first example comes from political scientist Chris Hanretty, who applied a statistical model to opinion polls to forecast the outcome of Italy’s parliamentary elections. Hanretty’s algorithm indicated that a coalition of center-left parties was virtually certain to win a majority and form the next government, but that’s not what happened. After the dust had settled, Hanretty sifted through the rubble and concluded that “the predictions I made were off because the polls were off.” He added:
Had the exit polls given us reliable information, I could have made an instant prediction that would have been proved right. As it was, the exit polls were wrong, and badly so. This, to me, suggests that the polling industry has made a collective mistake.
The second recent example comes from doctoral candidate Ken Opalo, who used polling as grist for a statistical mill to forecast the outcome of Kenya’s presidential election. Ken’s forecast indicated that Uhuru Kenyatta would get the most votes but would fall short of the 50-percent-plus-one-vote required to win in the first round, making a run-off “almost inevitable.” In fact, Kenyatta cleared the 50-percent threshold on the first try, making him Kenya’s new president-elect. Once again, noisy polling data were apparently to blame. As Ken noted in a blog post before the results were finalized,
Mr. Kenyatta significantly outperformed the national polls leading to the election. I estimated that the national polls over-estimated Odinga’s support by about 3 percentage points. It appears that I may have underestimated their overestimation. I am also beginning to think that their regional weighting was worse than I thought.
As I see it, both of these forecasts were, as Nate Silver puts it in his book, wrong for the right reasons. Both Hanretty and Opalo built models that used the best and most relevant information available to them in a thoughtful way, and neither forecast was wildly off the mark. Instead, it just so happened that modest errors in the forecasts interacted with each country’s electoral rules to produce categorical outcomes that were quite different from the ones the forecasts had led us to expect.
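To make that interaction concrete, here is a minimal sketch in Python. It is not Hanretty’s or Opalo’s model, and every number in it is hypothetical. It assumes the true vote share is normally distributed around the poll average plus some systematic poll bias, and asks how likely the candidate is to clear a 50-percent threshold. The function name `prob_clears_threshold` and its parameters are mine, chosen for illustration.

```python
from statistics import NormalDist


def prob_clears_threshold(poll_share, poll_bias=0.0, error_sd=2.0, threshold=50.0):
    """Probability that the true vote share exceeds the threshold.

    Assumption (illustrative only): the true share is normally distributed
    with mean poll_share + poll_bias and standard deviation error_sd,
    all measured in percentage points.
    """
    true_share = NormalDist(mu=poll_share + poll_bias, sigma=error_sd)
    return 1.0 - true_share.cdf(threshold)


if __name__ == "__main__":
    # Hypothetical inputs: a candidate polling at 47 percent, with the polls
    # understating that candidate's support by 0, 1.5, or 3 points.
    for bias in (0.0, 1.5, 3.0):
        p = prob_clears_threshold(poll_share=47.0, poll_bias=bias)
        print(f"poll bias {bias:+.1f} pts -> P(first-round win) = {p:.2f}")
```

Under these made-up inputs, a few points of unmodeled polling bias are enough to move a first-round win from unlikely to a coin flip, even though the vote-share forecast itself is only modestly off. That is the sense in which a hard threshold in the electoral rules amplifies small errors.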
But that’s the rub, isn’t it? Even in the European Union in the Internet age, it’s still hard to predict the outcome of national elections. We’re getting smarter about how to model these things, and our computers can now process more of the models we can imagine, but polling data are still noisy and electoral systems complex.
And that’s elections, where polling data nicely mimic the data-generating process that underlies the events we’re trying to forecast. We don’t have polls telling us what share of the population plans to turn out for anti-government demonstrations or join a rebel group or carry out a coup—and even if we did, we probably wouldn’t trust them. Absent these micro-level data, we turn to proxy measures and indicators of structural opportunities and constraints, but every step away from the choices we’re trying to forecast adds more noise to the result. Agent-based computational models represent a promising alternative, but when it comes to macro-political phenomena like revolutions and state collapses, these systems are still in their infancy.
Don’t get me wrong. I’m thrilled to see more people using statistical models to try to forecast important events in international politics, and I would eagerly pit the forecasts from models like Hanretty’s and Opalo’s against the subjective judgments of individual experts any day. I just think it’s important to avoid prematurely declaring the arrival of a revolution in forecasting political events, to keep reminding ourselves how hard this problem still is. As if the (in)accuracy of our forecasts would let us have it any other way.