Why political scientists should predict

Last week, Hans Noel wrote a post for Mischiefs of Faction provocatively titled “Stop trying to predict the future”. I say provocatively because, if I read the post correctly, Noel’s argument deliberately contradicts his own headline. Noel wasn’t making a case against forecasting. Rather, he was arguing in favor of forecasting, as long as it’s done in service of social-scientific objectives.

If that’s right, then I largely agree with Noel’s argument and would restate it as follows. Political scientists shouldn’t get sucked into bickering with their colleagues over small differences in forecast accuracy around single events, because those differences will rarely contain enough information for us to learn much from them. Instead, we should take prediction seriously as a means of testing competing theories by doing two things.

First, we should build forecasting models that clearly represent contrasting sets of beliefs about the causes and precursors of the things we’re trying to predict. In Noel’s example, U.S. election forecasts are only scientifically interesting insofar as they come from models that instantiate different beliefs about why Americans vote the way they do. If, for example, a model that incorporates information about trends in unemployment consistently produces more accurate forecasts than a very similar model that doesn’t, then we can strengthen our confidence that trends in unemployment shape voter behavior. If all the predictive models use only the same inputs—polls, for example—we don’t leave ourselves much room to learn about theories from them.

In my work for the Early Warning Project, I have tried to follow this principle by organizing our multi-model ensemble around a pair of models that represent overlapping but distinct ideas about the origins of state-led mass killing. One model focuses on the characteristics of the political regimes that might perpetrate this kind of violence, while another focuses on the circumstances in which those regimes might find themselves. These models embody competing claims about why states kill, so a comparison of their predictive accuracy will give us a chance to learn something about the relative explanatory power of those competing claims. Most of the current work on forecasting U.S. elections follows this principle too, by the way, even if that’s not what gets emphasized in media coverage of those forecasts.
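To make that kind of comparison concrete, here’s a minimal sketch of the general idea in Python, using synthetic data and scikit-learn. The feature sets, model form, and numbers are all placeholders for illustration, not the Early Warning Project’s actual specification.

```python
# A minimal sketch of theory comparison via out-of-sample accuracy,
# using synthetic data. The "regime" and "circumstances" predictors
# are hypothetical stand-ins, not the real models' inputs.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
regime = rng.normal(size=(n, 3))          # hypothetical regime-trait predictors
circumstances = rng.normal(size=(n, 3))   # hypothetical circumstance predictors
# Synthetic outcome that depends on both sets of predictors
logit = 0.8 * regime[:, 0] + 0.5 * circumstances[:, 0] - 2.0
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

idx_train, idx_test = train_test_split(np.arange(n), test_size=0.5, random_state=1)

def oos_brier(X):
    """Fit on the training half, score on the held-out half (lower is better)."""
    model = LogisticRegression().fit(X[idx_train], y[idx_train])
    p = model.predict_proba(X[idx_test])[:, 1]
    return brier_score_loss(y[idx_test], p)

print("Regime-based model:       ", round(oos_brier(regime), 4))
print("Circumstances-based model:", round(oos_brier(circumstances), 4))
```

The point isn’t the particular numbers; it’s that each model encodes one theory’s preferred predictors, so the accuracy gap between them carries information about those theories.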

Second, we should only really compare the predictive power of those models across multiple events or a longer time span, where we can be more confident that observed differences in accuracy are meaningful. This is basic statistics. The smaller the sample, the less confident we can be that it is representative of the underlying distribution(s) from which it was drawn. If we declare victory or failure in response to just one or a few bits of feedback, we risk “correcting” for an unlikely draw that dimly reflects the processes that really interest us. Instead, we should let the models run for a while before chucking or tweaking them, or at least leave the initial version running while trying out alternatives.
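A toy simulation illustrates the problem. In the sketch below (all numbers invented), two forecasters are equally accurate in expectation, yet over a handful of events the observed gap between them can be large in either direction; only with many events does it settle down.

```python
# Toy simulation of the small-sample problem: two forecasters who are
# equally accurate in expectation can look very different over a few events.
import numpy as np

rng = np.random.default_rng(42)

def gap_spread(n_events, n_trials=5000):
    """Spread of the observed Brier-score gap between two equally good models."""
    gaps = []
    for _ in range(n_trials):
        p_true = rng.uniform(0.05, 0.95, size=n_events)
        y = rng.binomial(1, p_true)
        # Both forecasters see the true probability plus independent noise
        f_a = np.clip(p_true + rng.normal(0, 0.1, n_events), 0, 1)
        f_b = np.clip(p_true + rng.normal(0, 0.1, n_events), 0, 1)
        gaps.append(np.mean((f_a - y) ** 2) - np.mean((f_b - y) ** 2))
    return np.std(gaps)

for n in (5, 25, 100, 500):
    print(f"{n:>4} events: typical accuracy gap ~ {gap_spread(n):.3f}")
```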

Admittedly, this can be hard to do in practice, especially when the events of interest are rare. All of the applied forecasters I know—myself included—are tinkerers by nature, so it’s difficult for us to find the patience that second step requires. With U.S. elections, forecasters also know that they only get one shot every two or four years, and that most people won’t hear anything about their work beyond a topline summary that reads like a racing form from the horse track. If you’re at all competitive—and anyone doing this work probably is—it’s hard not to respond to that incentive. With the Early Warning Project, I worry about having a salient “miss” early in the system’s lifespan that encourages doubters to dismiss the work before we’ve really had a chance to assess its reliability and value. We can be patient, but if our intended audiences aren’t, then the system could fail to get the traction it deserves.

Difficult doesn’t mean impossible, however, and I’m optimistic that political scientists will increasingly use forecasting in service of their search for more useful and more powerful theories. Journal articles that take this idea seriously are still rare birds, especially on things other than U.S. elections, but you occasionally spot them (Exhibits A and B). As Drew Linzer tweeted in response to Noel’s post, “Arguing over [predictive] models is arguing over assumptions, which is arguing over theories. This is exactly what [political science] should be doing.”

Forecasting Round-Up No. 2

N.B. This is the second in an occasional series of posts I’m expecting to do on forecasting miscellany. You can find the first one here.

1. Over at Bad Hessian a few days ago, Trey Causey asked, “Where are the predictions in sociology?” After observing how the accuracy of some well-publicized forecasts of this year’s U.S. elections has produced “growing public recognition that quantitative forecasting models can produce valid results,” Trey wonders:

If the success of these models in forecasting the election results is seen as a victory for social science, why don’t sociologists emphasize the value of prediction and forecasting more? As far as I can tell, political scientists are outpacing sociologists in this area.

I gather that Trey intended his post to stimulate discussion among sociologists about the value of forecasting as an element of theory-building, and I’m all for that. As a political scientist, though, I found myself focusing on the comparison Trey drew between the two disciplines, and that got me thinking again about the state of forecasting in political science. On that topic, I had two brief thoughts.

First, my simple answer to why forecasting is getting more attention from political scientists than it used to is: money! In the past 20 years, arms of the U.S. government dealing with defense and intelligence seem to have taken a keener interest in using tools of social science to try to anticipate various calamities around the world. The research program I used to help manage, the Political Instability Task Force (PITF), got its start in the mid-1990s for that reason, and it’s still alive and kicking. PITF draws from several disciplines, but there’s no question that it’s dominated by political scientists, in large part because the events it tries to forecast—civil wars, mass killings, state collapses, and such—are traditionally the purview of political science.

I don’t have hard data to back this up, but I get the sense that the number and size of government contracts funding similar work has grown substantially since the mid-1990s, especially in the past several years. Things like the Department of Defense’s Minerva Initiative; IARPA’s ACE Program; the ICEWS program that started under DARPA and is now funded by the Office of Naval Research; and Homeland Security’s START consortium come to mind. Like PITF, all of these programs are interdisciplinary by design, but many of the topics they cover have their theoretical centers of gravity in political science.

In other words, through programs like these, the U.S. government is now spending millions of dollars each year to generate forecasts of things political scientists like to think about. Some of that money goes to private-sector contractors, but some of it is also flowing to research centers at universities. I don’t think any political scientists are getting rich off these contracts, but I gather there are bureaucratic and career incentives (as well as intellectual ones) that make the contracts rewarding to pursue. If that’s right, it’s not hard to understand why we’d be seeing more forecasting come out of political science than we used to.

My second reaction to Trey’s question is to point out that there actually isn’t a whole lot of forecasting happening in political science, either. That might seem like it contradicts my first point, but it really doesn’t. The fact is that forecasting has long been pooh-poohed in the academic social sciences, and even if that’s changing at the margins in some corners of the discipline, it’s still a peripheral endeavor.

The best evidence I have for this assertion is the brief history of the American Political Science Association’s Political Forecasting Group. To my knowledge—which comes from my participation in the group since its establishment—the Political Forecasting Group was only formed several years ago, and its membership is still too small to bump it up to the “organized section” status that groups representing more established subfields enjoy. What’s more, almost all of the panels the group has sponsored so far have focused on forecasts of U.S. elections. That’s partly because those papers are popular draws in election years, but it’s also because the group’s leadership has had a really hard time finding enough scholars doing forecasting on other topics to assemble panels.

If the discipline’s flagship association in one of the countries most culturally disposed to doing this kind of work has trouble cobbling together occasional panels on forecasts of things other than elections, then I think it’s fair to say that forecasting still isn’t a mainstream pursuit in political science, either.

2. Speaking of U.S. election forecasting, Drew Linzer recently blogged a clinic in how statistical forecasts should be evaluated. Via his web site, Votamatic, Drew:

1) began publishing forecasts about the 2012 elections well in advance of Election Day (so there couldn’t be any post hoc hemming and hawing about what his forecasts really were);

2) described in detail how his forecasting model works;

3) laid out a set of criteria he would use to judge those forecasts after the election; and then

4) walked us through his evaluations soon after the results were (mostly) in.

Oh, and in case you’re wondering: Drew’s model performed very well, thank you.
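For readers who want a feel for what a pre-registered evaluation can look like, here’s a crude calibration check in Python on simulated forecasts. It’s illustrative only—this is not Drew’s actual evaluation procedure—and the forecasts and outcomes are invented.

```python
# A simple calibration check on simulated (not real) forecasts:
# within each probability bin, how often did the event actually happen?
import numpy as np

rng = np.random.default_rng(3)
p_forecast = rng.uniform(0, 1, size=1000)   # stand-in for pre-registered forecasts
y = rng.binomial(1, p_forecast)             # simulated outcomes

bins = np.linspace(0, 1, 6)                 # five probability bins
for lo, hi in zip(bins[:-1], bins[1:]):
    mask = (p_forecast >= lo) & (p_forecast < hi)
    if mask.any():
        print(f"forecast {lo:.1f}-{hi:.1f}: observed rate {y[mask].mean():.2f}")
```

A well-calibrated forecaster’s observed rates track the forecast bins; declaring the criteria before the results come in is what makes the exercise honest.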

3. But you know what worked a little better than Drew’s election-forecasting model, and pretty much everyone else’s, too? An average of the forecasts from several of them. As it happens, this pattern is pretty robust. A well-designed statistical model is great for forecasting, but an average of forecasts from a number of them is usually going to be even better. Just ask the weather guys.
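Here’s a quick simulation of why that tends to happen: each model’s idiosyncratic errors partially cancel when you average, so the ensemble usually scores better (a lower Brier score) than most of its members. The forecasts below are simulated, not real election forecasts.

```python
# Why averaging several decent models tends to beat most of its members:
# independent errors partially cancel in the mean. Simulated data only.
import numpy as np

rng = np.random.default_rng(7)
n_events, n_models = 200, 5

p_true = rng.uniform(0.1, 0.9, size=n_events)
y = rng.binomial(1, p_true)
# Each model's forecast is the truth plus its own independent error
forecasts = np.clip(p_true + rng.normal(0, 0.15, size=(n_models, n_events)), 0, 1)

def brier(f):
    return np.mean((f - y) ** 2)

individual = [brier(f) for f in forecasts]
ensemble = brier(forecasts.mean(axis=0))

print("Individual Brier scores:", np.round(individual, 3))
print("Ensemble average Brier: ", round(ensemble, 3))
```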

4. Finally, for those of you—like me—who want to keep holding pundits’ feet to the fire long after the election’s over, rejoice that Pundit Tracker is now up and running, and they even have a stream devoted specifically to politics. Among other things, they’ve got John McLaughlin on the record predicting that Hillary Clinton will win the presidency in 2016, and that President Obama will not nominate Susan Rice to be Secretary of State. McLaughlin’s hit rate so far is a rather mediocre 49 percent (18 of 37 graded calls correct), so make of those predictions what you will.
