Forecasting Popular Uprisings in 2011: How Are We Doing?

Earlier this year, I participated in a workshop at the Council on Foreign Relations on predicting political instability. Not surprisingly, the decision to hold the workshop was partially inspired by events in the Middle East and North Africa, and one of the questions on our minds was, “Can we do a better job anticipating these kinds of uprisings?”

To see how statistical modeling might answer that question, I applied a technique called Bayesian model averaging (BMA) to data from 1972 through 2009 to develop an algorithm that could be used to estimate the probability of nonviolent uprisings in 2011 (and future years) in countries worldwide. For this analysis, I used historical data on nonviolent rebellions produced by Professors Erica Chenoweth and Maria Stephan, whose work on the subject is described in this journal article and this op-ed. In their research, a nonviolent rebellion is defined as “a campaign of purposive, nonviolent mass events in pursuit of a political objective.” As I understand their definition and data, occasional demonstrations or strikes do not qualify; for protest activity to get tagged as a nonviolent rebellion, it must be directed at central state authority, involve large numbers of participants, and be sustained. For example, I’d say that popular uprisings in Egypt, Syria, and Tunisia in 2011 would meet their definition, while smaller and more sporadic protests so far this year in Georgia and Sudan would not. My statistical analysis looked at the onsets of these campaigns, whether or not one or more campaigns were already occurring in the same country. Chenoweth and Stephan’s data set ends in 2006; I made informed guesses to extend it through 2009, and I changed a few historical observations with which I disagreed.

I’ll provide more information about the results of the statistical analysis toward the end of this post, but, to avoid losing readers who are less interested in those technical details, let me cut to the chase: Where have popular uprisings occurred so far this year, and how good was the statistical “model” at distinguishing those countries from ones that have not seen protest campaigns yet this year?

Let’s start with the first question. I have two lists of countries that have seen onsets of nonviolent rebellion in 2011 as of early June: one I generated, and one Erica Chenoweth shared with me on request (thank you, Erica!). Those lists overlap but are not identical. To reflect that uncertainty about where these campaigns are occurring, I will discuss both versions.

My list: Albania, Bahrain, Burkina Faso, Egypt, Syria, Tunisia, Uganda, Yemen
Erica’s list: Bahrain, Egypt, Iran, Libya, Oman, Saudi Arabia, Syria, Yemen

Erica and I both identify eight onsets of nonviolent rebellion so far in 2011, but there are only four countries that appear on both lists: Bahrain, Egypt, Syria, and Yemen. One of the differences is just a matter of timing; I put the start of the Tunisian uprising in 2011, but Erica sees it as having started in December 2010. The other points of divergence reflect differences in judgment and, probably, available information. I identify new nonviolent uprisings this year in Albania, Burkina Faso, and Uganda, where Erica sees them in Iran, Libya, Oman, and Saudi Arabia.

Those two lists give us alternate versions of a “ground truth” for the first half of 2011 against which we can compare our statistical forecasts. So how are those forecasts faring? The quickest answer comes in the form of the area under the receiver operating characteristic curve, or AUC, a quantity statisticians often use to summarize the accuracy of a model that’s meant to classify cases on an either/or outcome. The AUC represents the probability that a given classifier (here, the forecasting algorithm produced by Bayesian model averaging) will rank a randomly selected positive case (onset) higher than a randomly selected negative case (non-onset). An AUC of 0.5 is what you’d expect to get from coin-flipping. A score in the 0.70s is good; a score in the 0.80s is very good; and a score in the 0.90s is excellent. The mid-year AUC scores for the two lists (with 95% confidence intervals) are as follows.

Out-of-sample AUC using my 2011 list: 0.75 (0.57, 0.94)
Out-of-sample AUC using Erica’s 2011 list: 0.87 (0.79, 0.95)
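
For readers who want to see what that "probability of ranking an onset above a non-onset" means in practice, here is a minimal sketch in Python. It is not the code behind the scores above; the function name `auc_by_pairs` and the toy numbers are made up for illustration. It simply compares every onset with every non-onset and counts how often the onset received the higher forecast, with ties counted as half.

```python
def auc_by_pairs(scores, labels):
    """Share of (onset, non-onset) pairs in which the onset has the higher score.

    scores: predicted probabilities, one per country
    labels: 1 for an onset, 0 for no onset, in the same order
    Ties count as half a "win", which matches the usual AUC definition.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one onset and one non-onset")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented, not the real forecasts): six countries, two onsets.
toy_scores = [0.09, 0.05, 0.04, 0.02, 0.01, 0.01]
toy_labels = [1, 0, 1, 0, 0, 0]
toy_auc = auc_by_pairs(toy_scores, toy_labels)
print(toy_auc)  # 7 of the 8 onset/non-onset pairs are ordered correctly: 0.875
```

With only eight onsets among 163 countries, a handful of pairs can move this number noticeably, which is one reason the confidence intervals above are so wide.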

The samples are very small (8 or 9 events in 163 countries with populations larger than 500,000), and the year is only half over, but those preliminary results are solid out-of-sample scores for a rare-events analysis, comparable to the accuracy rates achieved by state-of-the-art efforts to predict political instability. Based on those scores, I think it’s fair to say (tentatively) that this statistical tool is doing a pretty good job so far this year at assessing risks of popular uprisings that some people have argued are impossible to forecast well.

AUC scores are a useful summary statistic, but they don’t give us very specific information about how the forecasts would perform in ways that people might actually use them. One of the challenges when using statistical analysis to forecast rare events is that the distribution of the forecasts inevitably skews toward the base rate, which is close to zero (on average, only a few onsets of nonviolent rebellion occur each year). Because of that skewing, we sometimes focus on relative rather than absolute risk when using statistical estimates to try to forecast rare events. Instead of looking at the predicted probabilities and saying we don’t think any of these events are going to happen anywhere in 2011, we start by assuming that a few of them are going to happen and then use our rank-ordered list to identify which countries are the most likely candidates. One way to compare the relative risks is to group the ordered data by quantile. Here, I’m going to use quintiles (i.e., fifths), referring to the first quintile as the “most likely” group, the second quintile as the “moderately likely” group, and the bottom three quintiles as the “least likely” group. It might seem more natural to use terciles (thirds) if we’re going to break the forecasts into three groups, but I think the 1:1:3 breakdown of quintiles is more useful with rare-events forecasts because it more accurately reflects the long tail in the distribution of the predicted probabilities (i.e., the fact that most of the forecasts are approximately 0).
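
The 1:1:3 grouping described above can be sketched in a few lines of Python. This is an illustration under my own assumptions rather than the code used for the post: the function name `risk_groups`, the rank-based cut points, and the toy forecasts are all hypothetical, and ties are broken by whatever order the input happens to be in.

```python
def risk_groups(forecasts):
    """Assign each country to a risk group by rank-based quintile.

    forecasts: dict mapping country name -> predicted probability
    Returns a dict mapping country name -> group label, where the top
    quintile is "most likely", the second is "moderately likely", and
    the bottom three quintiles together are "least likely".
    """
    ranked = sorted(forecasts, key=forecasts.get, reverse=True)
    n = len(ranked)
    groups = {}
    for rank, country in enumerate(ranked):
        quintile = rank * 5 // n + 1  # 1 = highest-risk fifth, 5 = lowest
        if quintile == 1:
            groups[country] = "most likely"
        elif quintile == 2:
            groups[country] = "moderately likely"
        else:
            groups[country] = "least likely"
    return groups


# Toy forecasts for ten made-up countries (not real estimates).
toy = {"A": 0.12, "B": 0.08, "C": 0.05, "D": 0.03, "E": 0.02,
       "F": 0.01, "G": 0.01, "H": 0.005, "I": 0.004, "J": 0.002}
for country, group in risk_groups(toy).items():
    print(country, group)
```

Because the groups are defined by rank rather than by a probability threshold, the labels describe relative risk only; a country in the "most likely" group can still have a small predicted probability in absolute terms.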

So, using that categorization scheme, how have my BMA-based forecasts done so far in 2011? Once again, the results look pretty good. The bar plot below shows the number of onsets so far this year that have occurred in each quintile, using the two different lists. Each quintile comprises 32 or 33 countries. Of the eight cases I have called onsets, five were in the most-likely group (top quintile), two were in the moderately likely group (2nd quintile), and one (Albania) was in the least-likely group (5th quintile). Of the eight cases where Erica sees onsets so far this year, six are in the most-likely group, and the other two are in the moderately likely group (2nd quintile). Based on my experience trying to forecast rare forms of political instability, I would say that those are very good out-of-sample results.

Some of you are probably wondering what other countries this statistical tool identifies as being among the most likely to see onsets of nonviolent rebellion in 2011. The following dot plot shows, in descending order, 2011 forecasts for the 40 countries with the greatest likelihood that a nonviolent uprising would begin there at some point this year, according to the algorithm I got from Bayesian model averaging.

Some of you are probably also wondering what the statistical analysis underlying these forecasts tells us about the correlates of nonviolent rebellion. What follows is a list of the variables I included in the Bayesian model averaging exercise, based on my reading of prior theory and research and the availability of reasonably reliable time-series cross-sectional data at no cost.

Population Size. Relative to annual global median, logged.
Poverty. Infant mortality rate relative to annual global median, logged.
Urbanization. Percent of population living in urban areas.
Literacy. Adult literacy rate.
Mobile Telephony. Subscribers per 1,000 population.
Internet Usage. Users per 1,000 population.
Economic Growth. Year-to-year percent change in real GDP per capita.
Civil Liberties. Freedom House’s seven-point scale.
Democracy. Polity’s 21-point scale, rescaled to 0-10.
Regime Stability. Years since last change of 3+ points in Polity score, logged.
Recent Uprising. Indicator of any nonviolent rebellion in same country in previous year.
Recent Civil War. Indicator of any violent rebellion in same country in previous year.
Uprisings in Region. Count of countries in same region with nonviolent rebellion in previous year, logged.
ICCPR 1st Optional Protocol. Indicator of whether or not country has signed International Covenant on Civil and Political Rights’ 1st Optional Protocol, which gives citizens the right to petition the UN for alleged violations.
GATT/WTO Member. Indicator of whether or not country is a signatory to GATT or (after 1994) a member of the WTO.
Post-Cold War. Indicator for post-cold war period, identified as 1989 or later.
Colonial Legacies. Series of binary variables identifying country’s last colonizer (or lack thereof).
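
Two of the transformations in that list are easy to show concretely. The sketch below is only a guess at the mechanics, since the list does not spell them out: it assumes natural logs, assumes "relative to annual global median" means dividing each country's value by that year's median before logging, and assumes the Polity rescaling is a simple linear shift from the usual -10 to +10 range. The function names are mine. Measures that can equal zero (years since last regime change, counts of regional uprisings) would need some offset before logging, which is left out here because the post doesn't say which one was used.

```python
import math
from statistics import median


def log_relative_to_median(values):
    """Log of each country's value divided by the cross-country median.

    values: dict mapping country -> positive value for a single year
    (e.g., population or infant mortality rate).
    """
    center = median(values.values())
    return {country: math.log(v / center) for country, v in values.items()}


def rescale_polity(score):
    """Map a Polity score on the -10..+10 scale onto 0..10 (assumed linear)."""
    return (score + 10) / 2.0


# Toy example: populations in millions for three made-up countries in one year.
toy_population = {"A": 2.0, "B": 8.0, "C": 32.0}
print(log_relative_to_median(toy_population))  # the median country gets 0.0
print(rescale_polity(-10), rescale_polity(0), rescale_polity(10))
```

Expressing a measure relative to the annual median removes the global trend from it, so a country's value in a given year reflects where it sits among its contemporaries rather than how the whole world has changed since 1972.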

Those are the variables that went into the analysis, but only nine of them are actually influencing the forecasts. According to my statistical analysis, the rest are not particularly useful for predicting the onset of nonviolent rebellion. Here is a list of the nine that are, along with the posterior means and probabilities from Bayesian model averaging. [For those of you accustomed to reading results from a single regression model, the posterior mean is akin to an estimated coefficient, and the posterior probability is akin to 1 minus the p-value (so higher values indicate more confidence that the variable is a useful predictor of the outcome in question). Most of the coefficients are not on a common scale, so they shouldn’t be compared directly to each other.]

Population Size: 0.379 (100%)
Democracy Score: -0.033 (100%)
Literacy: 0.012 (67%)
Uprisings in Region: 0.132 (22%)
Civil Liberties: -0.072 (22%)
Post-Cold War: 0.029 (5%)
ICCPR 1st Optional Protocol: 0.025 (4%)
GATT/WTO Member: 0.021 (4%)
Economic Growth: -0.003 (3%)
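
To give a sense of where numbers like these come from, here is a rough Python sketch of how model averaging can turn a set of candidate models into a posterior mean and a posterior probability for each variable. It is a toy under stated assumptions, not the procedure that produced the table: it assumes each candidate model has already been fitted, that models are weighted using the common BIC approximation with equal prior weight on every model, and that a variable's coefficient counts as zero in models that leave it out. The function name `bma_summary` and every number in the example are invented.

```python
import math


def bma_summary(models):
    """Combine fitted models into per-variable posterior means and probabilities.

    models: list of dicts, each with
        "coefs": dict mapping variable name -> estimated coefficient
        "bic":   the model's BIC (lower is better)
    Returns a dict mapping variable -> (posterior mean, posterior probability).
    """
    best = min(m["bic"] for m in models)
    # BIC approximation to posterior model weights, assuming equal priors.
    raw = [math.exp(-0.5 * (m["bic"] - best)) for m in models]
    total = sum(raw)
    weights = [r / total for r in raw]

    variables = sorted({v for m in models for v in m["coefs"]})
    summary = {}
    for v in variables:
        # Probability the variable belongs in the model: total weight of
        # the models that include it.
        prob = sum(w for w, m in zip(weights, models) if v in m["coefs"])
        # Weighted average coefficient, counting 0 where the variable is absent.
        mean = sum(w * m["coefs"].get(v, 0.0) for w, m in zip(weights, models))
        summary[v] = (mean, prob)
    return summary


# Two made-up candidate models (illustrative numbers only).
toy_models = [
    {"coefs": {"population": 0.40, "democracy": -0.03}, "bic": 100.0},
    {"coefs": {"population": 0.35, "democracy": -0.04, "literacy": 0.02}, "bic": 102.0},
]
for name, (mean, prob) in bma_summary(toy_models).items():
    print(name, round(mean, 3), round(prob, 2))
```

Under these assumptions, a variable that shows up only in poorly supported models ends up with both a low posterior probability and a posterior mean pulled toward zero, which is the pattern in the bottom rows of the table.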

I will not succumb to the temptation to draw causal inferences from these associations. Even without making that heroic leap, though, we can talk about the associations we see in these results. Other things being equal, nonviolent rebellions are more likely to occur…

In countries with larger populations;
In countries with the least democratic institutions;
In countries with more expansive civil liberties;
In countries with higher literacy rates;
When more uprisings are already occurring in regional neighbors;
In the post-cold war period;
In countries that belong to the WTO;
In countries that have signed the 1st Optional Protocol of the ICCPR; and
When economic growth is slower.

There are also some interesting negative findings here. According to my analysis, variables that are not particularly useful for forecasting nonviolent rebellion when the measures listed above are also in the mix include:

Poverty (as measured by infant mortality);
Cellular phone penetration (as measured by mobile phone subscribers per 1,000 population); and
Internet access (as measured by users per 1,000 population).

These three negative findings contradict many of the on-the-fly explanations I’ve read for the protests that are occurring this year in the Middle East and North Africa. It’s also worth pointing out that the association identified between economic growth and nonviolent uprisings is pretty tiny. It’s not quite zero, but it’s awfully close to it, and that result contradicts the prevailing belief that economic slowdowns are one of the, if not the, most important triggers to popular unrest.

On the whole, I think this exercise reaffirms the claim that we can get useful forecasts of rare forms of political instability, including popular uprisings, from statistical analysis of widely available country-level data. That doesn’t mean we can’t do even better. What’s needed to do this particular analysis better is higher-resolution data on dynamics of nonviolent rebellion. That kind of data would allow us to differentiate more subtly between situations like Egypt’s and, say, Sudan’s. Some scholars are doing excellent work right now using software to turn news reports into event data that should enable that kind of analysis, but to the best of my knowledge, we’re not there yet.

24 Comments

by Jay Ulfelder on June 9, 2011

Posted in Forecasting, Protests and Popular Uprisings, Revolutions, Statistics

https://dartthrowingchimp.wordpress.com/2011/06/09/forecasting-popular-uprisings-in-2011-how-are-we-doing/

Michael Ross
/ June 9, 2011

This is great. Interestingly, last December Foreign Policy magazine listed a couple of dozen countries where conflicts were most likely to break out in 2011. It was published just a few weeks before the protests in Tunisia. The only Middle East countries on the list, if I remember correctly, were Lebanon and Iraq – two of the more peaceful spots in the region. This underscores how impressive your model is.

A couple of other thoughts: I often hear that the Mideast conflicts were triggered by economic growth that was unequal, or by corruption. Any reason you didn’t try a measure of inequality or corruption? I wouldn’t expect them to matter, but it would be interesting to show that they weren’t robust predictors.

And: presumably we don’t see rebellions in some countries where we might because the government is effective at anticipating and blocking them – maybe China and Saudi Arabia? Is there anything measurable that might capture this?

Overall, a very impressive job. I hope USG types are listening.

- dartthrowingchimp
  / June 10, 2011
  
  Thanks very much, Michael. In answer to your question, I would like to have included inequality and corruption in the mix of variables tested, but the missing-data problems are severe. The time-series cross-sectional data on both of those concepts are spotty (some countries have no observations) and shallow (they don’t extend very far back in time). So I decided I wouldn’t use them because it would have left me with a much smaller sample, and it would have meant leaving some countries out of the forecasts. On successful repression, I think the democracy and civil-liberties measures in the analysis are getting at this aspect in a gross way, but there are surely many subtleties that are missed. That would be an important area for improvement, especially in any future modeling that uses a more nuanced measure of protest activity as a dependent variable.
  
Hein Goemans
/ June 10, 2011

Given that Erica generated most of the training data, it should come as no surprise she also does “better” in forecasting, no?

- dartthrowingchimp
  / June 10, 2011
  
  Right. Or, to put it another way, I would expect the forecasts to fit better with out-of-sample data generated by the same process (or, in this case, person) than with out-of-sample data generated by a different process, and that seems to be the case so far. There might be aspects of the original concept or definition I’m missing, or differences in source materials. That’s why I was so pleased that the forecasts also did well against my list. And, of course, both of those assessments come with the caveat that the samples are still very small.
  
Philip Schrodt
/ June 11, 2011

Nice analysis. Two thoughts:

1. As has probably been noted in this blog before, what would be really nice would be a better sense of the human baseline accuracy beyond that of, well, a dart-throwing chimp. Tetlock’s results still seem to be the best we have, but I think a problem with Tetlock is that we don’t have a sense of how intrinsically difficult his cases were, and my sense is that generally they were fairly difficult, hence the low accuracy.

The “Arab Spring”, however, is probably also fairly difficult and — unlike the 2008+ economic problems, where some people can make a pretty good case for anticipating those (and some people made a great deal of money in that manner) — I’ve yet to see qualitative analysts bragging that they correctly anticipated the Arab Spring. Any pointers to the contrary would be appreciated. My guess is that the human accuracy on a problem like this is probably better than 50%, but substantially less than an AUC of 0.80.

I have this vague memory that there was an exercise about ten years ago that was doing systematic qualitative forecasting for the Israeli-Palestinian conflict, probably during the time of the second intifada, on a quarterly basis, with the objective of getting a prediction accuracy baseline, but I don’t recall anything coming out of this in terms of publications.

2. If we go to a more micro-level, I could see the prediction component of this set of incidents being at least three issues

a. Some incident would set off a revolt in at least one of the vulnerable countries: presumably no one assumes that the entire thing was dependent solely on the decidedly low-probability self-immolation of a fruit vendor in Middle-of-Nowhere, Tunisia,

b. The fact that it would spread to Egypt, Libya, Bahrain, Yemen and Syria: while multiple-contagion situations such as this are relatively unusual, they are hardly unique (e.g. Eastern Europe 1989, Europe 1848, Southern USA 1861)

c. The fact that this has *not* [at least so far…] spread to Morocco, Algeria, Jordan, Kuwait and any of the other GCC states. Once this whole thing plays out, the negative cases may well be as interesting as the positive.

- dartthrowingchimp
  / June 12, 2011
  
  Thanks, Phil, for raising those very interesting questions (human baseline, microdynamics vs. macrodynamics, explaining negative cases). On assessing qualitative forecasts for the Middle East, the Stimson Center issued a report called “Seismic Shift” in May 2011 that took a non-systematic but informative look at that question.
  
LFC
/ June 12, 2011

“other things being equal, non-violent rebellions are more likely to occur…in countries with more expansive civil liberties….”

I think you meant to write: “less expansive civil liberties” (given, among other things, the negative ‘posterior means’ for civil liberties and for democracy).

- dartthrowingchimp
  / June 13, 2011
  
  Thanks for the close read. Actually, the text there is correct. The confusion arises because the Freedom House index of civil liberties on which that estimate is based is not scaled intuitively. The index runs from 1 to 7, but 1 indicates the most expansive civil liberties and 7 the least. I usually rescale it before modeling to avoid that confusion but did not do so here. In any case, the result is substantively interesting for the reason you point out; the associations with degree of democracy (per Polity) and civil liberties (per Freedom House) point in opposite directions.
  
  - LFC
    / June 13, 2011
    
    Thanks for clarifying.
Abhimanyu Arora
/ August 5, 2011

Dear Jay,
Thanks for the interesting read. I have a couple of technical concerns. Would be great to know your take on them and learn more about the nuances in BMA.
The result that internet and mobiles are insignificant… how do you see it in light of the fact that they actually began to take shape only in the last 10 years of the 40-year historical period?
Then there is a question of the news reports themselves—to what extent they themselves are ‘exogenous’, so to speak?

- dartthrowingchimp
  / August 5, 2011
  
  Thanks for reading and commenting, Abhimanyu.
  
  On the newness of the Internet and mobile phones, those started to show up more like 15-20 years ago, so the track record isn’t super-short. As I see it, if those technologies were strongly associated with an increased likelihood of a popular uprising, that should be enough time to see a pattern emerge. That said, it will be interesting to re-analyze the data with this year’s bumper crop of uprisings included and see if that has any effect on the estimates.
  
  Regarding the news reports, I’m not sure what you’re asking, but I think you’re referring to the fact that some of the variables in the analysis (e.g., urbanization, civil liberties) might be associated with the likelihood that civil resistance would get reported in the first place. I think that would be a big concern if this analysis were looking at micro-events (e.g., individual protests), but I believe it’s not a significant problem when the outcome of interest is the onset of a large and sustained campaign in the past few decades. Those are generally considered newsworthy everywhere, and they’re hard to hide.
  
  - Abhimanyu Arora
    / August 5, 2011
    
    Thanks for the clarifications, Jay. You took my question in the right direction. What I also had in mind, apart from the linkage between reportage and other covariates, was a kind of trend (or a journalism bias?), in the sense that events are reported faster (and more, perhaps) these days than in 1970, but your answer takes care of that as well.