Why My Coup Risk Models Don’t Include Any Measures of National Militaries

For the past several years (here, here, here, and here), I’ve used statistical models estimated from country-year data to produce assessments of coup risk in countries worldwide. I rejigger the models a bit each time, but none of the models I’ve used so far has included specific features of countries’ militaries.

That omission strikes a lot of people as a curious one. When I shared this year’s assessments with the Conflict Research Group on Facebook, one group member posted this comment:

Why do none of the covariates feature any data on militaries? Seeing as militaries are the ones who stage the coups, any sort of predictive model that doesn’t account for the militaries themselves would seem incomplete.

I agree in principle. It’s the practical side that gets in the way. I don’t include features of national militaries in the models because I don’t have reliable measures of them with the coverage I need for this task.

To train and then apply these predictive models, I need fairly complete time series for all or nearly all countries of the world that extend back to at least the 1980s and have been updated recently enough to give me a useful input for the current assessment (see here for more on why that’s true). I looked again early this month and still can’t find anything like that on even the big stuff, like military budgets, size, and force structures. There are some series on this topic in the World Bank’s World Development Indicators (WDI) data set, but those series have a lot of gaps, and the presence of those gaps is correlated with other features of the models (e.g., regime type). Ditto for SIPRI. And, of course, those aren’t even the most interesting features for coup risk, such as whether military promotions favor certain groups over others, or whether there is a capable and purportedly loyal presidential guard.

But don’t take my word for it. Here’s what the Correlates of War Project says in the documentation for Version 4.0 of its widely-used data set (PDF) about its measure of military expenditures, one of two features of national militaries it tries to cover (the other is total personnel):

It was often difficult to identify and exclude civil expenditures from reported budgets of less developed nations. For many countries, including some major powers, published military budgets are a catch-all category for a variety of developmental and administrative expenses—public works, colonial administration, development of the merchant marine, construction, and improvement of harbor and navigational facilities, transportation of civilian personnel, and the delivery of mail—of dubious military relevance. Except when we were able to obtain finance ministry reports, it is impossible to make detailed breakdowns. Even when such reports were available, it proved difficult to delineate “purely” military outlays. For example, consider the case in which the military builds a road that facilitates troops movements, but which is used primarily by civilians. A related problem concerns those instances in which the reported military budget does not reflect all of the resources devoted to that sector. This usually happens when a nation tries to hide such expenditures from scrutiny; for instance, most Western scholars and military experts agree that officially reported post-1945 Soviet-bloc totals are unrealistically low, although they disagree on the appropriate adjustments.

And that’s just the part of the “Problems and Possible Errors” section about observing the numerator in a calculation that also requires a complicated denominator. And that’s for what is—in principle, at least—one of the most observable features of a country’s civil-military relations.

Okay, now let’s assume that problem magically disappears, and COW has nearly complete and reliable data on military expenditures. Now we want to use models trained on those data to estimate coup risk for 2015. Whoops: COW only runs through 2010! The World Bank and SIPRI get closer to the current year—observations through 2013 are available now—but there are missing values for lots of countries, and that missingness is caused by other predictors of coup risk, such as national wealth, armed conflict, and political regime type. For example, WDI has never had any data on military expenditures for Eritrea or North Korea, and the series for Central African Republic is patchy throughout and ends in 2010. If I wanted to include military expenditures in my predictive models, I could use multiple imputation to deal with these gaps in the training phase, but then how would I generate current forecasts for these important cases? I could make guesses, but how accurate could those guesses be for a case like Eritrea or North Korea, and am I then adding signal or noise to the resulting forecasts?

Of course, one of the luxuries of applied forecasting is that the models we use can lack important features and still “work.” I don’t need the model to be complete and its parameters to be true for the forecasts to be accurate enough to be useful. Still, I’ll admit that, as a social scientist by training, I find it frustrating to have to set aside so many intriguing ideas because we simply don’t have the data to try them.

Estimating NFL Team-Specific Home-Field Advantage

This morning, I tinkered a bit with my pro-football preseason team strength survey data from 2013 and 2014 to see what other simple things I might do to improve the accuracy of forecasts derived from future versions of them.

My first idea is to go beyond a generic estimate of home-field advantage—about 3 points, according to my and everyone else’s estimates—with team-specific versions of that quantity. The intuition is that some venues confer a bigger advantage than others. For example, I would guess that Denver enjoys a bigger home-field edge than most teams because their stadium is at high altitude. The Broncos live there, so they’re used to it, but visiting teams have to adapt, and that process supposedly takes about a day for every 1,000 feet over 3,000. Some venues are louder than others, and that noise is often dialed up when visiting teams would prefer some quiet. And so on.

To explore this idea, I’m using a simple hierarchical linear model to estimate team-specific intercepts after taking preseason estimates of relative team strength into account. The line of R code used to estimate the model requires the lme4 package and looks like this:

mod1 <- lmer(score.raw ~ wiki.diff + (1 | home_team), results)

Where

score.raw = home_score - visitor_score
wiki.diff = home_wiki - visitor_wiki

Those wiki vectors are the team strength scores estimated from preseason pairwise wiki surveys. The ‘results’ data frame includes scores for all regular and postseason games from those two years so far, courtesy of devstopfix’s NFL results repository on GitHub (here). Because the net game and strength scores are both ordered home to visitor, we can read those random intercepts for each home team as estimates of team-specific home advantage. There are probably other sources of team-specific bias in my data, so those estimates are going to be pretty noisy, but I think it’s a reasonable starting point.

My initial results are shown in the plot below, which I get with these two lines of code, the second of which requires the lattice package:

ha1 <- ranef(mod1, condVar=TRUE)
dotplot(ha1)

Bear in mind that the generic (fixed) intercept is 2.7, so the estimated home-field advantage for each team is what’s shown in the plot plus that number. For example, these estimates imply that my Ravens enjoy a net advantage of about 3 points when they play in Baltimore, while their division-rival Bengals are closer to 6.

home.advantage.estimates

In light of DeflateGate, I guess I shouldn’t be surprised to see the Pats at the top of the chart, almost a whole point higher than the second-highest team. Maybe their insanely low home fumble rate has something to do with it.* I’m also heartened to see relatively high estimates for Denver, given the intuition that started this exercise, and Seattle, which is often said to enjoy an unusually large home-field edge. At the same time, I honestly don’t know what to make of the exceptionally low estimates for DC and Jacksonville, who appear from these estimates to suffer a net home-field disadvantage. That strikes me as odd and undercuts my confidence in the results.

In any case, that’s how far my tinkering took me today. If I get really motivated, I might try re-estimating the model with just the 2013 data and then running the 2014 preseason survey scores through that model to generate “forecasts” that I can compare to the ones I got from the simple linear model with just the generic intercept (here). The point of the exercise was to try to get more accurate forecasts from simple models, and the only way to test that is to do it. I’m also trying to decide whether I need to cross these team-specific effects with season-specific effects to control for differences across years in the biases in the wiki survey results when estimating these team-specific intercepts. But I’m not there yet.

* After I published this post, Michael Lopez helpfully pointed me toward a better take on the Patriots’ fumble rate (here), and Mo Patel observed that teams manage their own footballs on the road, too, so that particular tweak—if it really happened—wouldn’t have a home-field-specific effect.

The State of the Art in the Production of Political Event Data

Peter Nardulli, Scott Althaus, and Matthew Hayes have a piece forthcoming in Sociological Methodology (PDF) that describes what I now see as the cutting edge in the production of political event data: machine-human hybrid systems.

If you have ever participated in the production of political event data, you know that having people find, read, and code data from news stories and other texts takes a tremendous amount of work. Even boutique data sets on narrowly defined topics for short time periods in single cases usually require hundreds or thousands of person-hours to create, and the results still aren’t as pristine as we’d like or often believe.

Contrary to my premature congratulation on GDELT a couple of years ago, however, fully automated systems are not quite ready to take over the task, either. Once a machine-coding system has been built, the data come fast and cheap, but those data are, inevitably, still pretty noisy. (On that point, see here for some of my own experiences with GDELT and here, here, here, here, and here for other relevant discussions.)

I’m now convinced that the best current solution is one that borrows strength from both approaches—in other words, a hybrid. As Nardulli, Althaus, and Hayes argue in their forthcoming article, “Machine coding is no simple substitute for human coding.”

Until fully automated approaches can match the flexibility and contextual richness of human coding, the best option for generating near-term advances in social science research lies in hybrid systems that rely on both machines and humans for extracting information from unstructured texts.

As you might expect, Nardulli & co. have built and are operating such a system—the Social, Political, and Economic Event Database (SPEED)—to code data on a bunch of interesting things, including coups and civil unrest. Their hybrid process goes beyond supervised learning, where an algorithm gets trained on a data set carefully constructed by human coders and then put in the traces to make new data from fresh material. Instead, they adopt a “progressive supervised-learning system,” which basically means two things:

  1. They keep humans in the loop for all steps where the error rate from their machine-only process remains intolerably high, making the results as reliable as possible; and
  2. They use those humans’ coding decisions as new training sets to continually check and refine their algorithms, gradually shrinking the load borne by the humans and mitigating the substantial risk of concept drift that attaches to any attempt to automate the extraction of data from a constantly evolving news-media ecosystem.

I think SPEED exemplifies the state of the art in a couple of big ways. The first is the process itself. Machine-learning processes have made tremendous gains in the past several years (see here, h/t Steve Mills), but we still haven’t arrived at the point where we can write algorithms that reliably recognize and extract the information we want from the torrent of news stories coursing through the Internet. As long as that’s the case—and I expect it will be for at least another several years—we’re going to need to keep humans in the loop to get data sets we really trust and understand. (And, of course, even then the results will still suffer from biases that even a perfect coding process can’t avoid; see here for Will Moore’s thoughtful discussion of that point.)

The second way in which SPEED exemplifies the state of the art is what Nardulli, Althaus, and Hayes’ paper explicitly and implicitly tells us about the cost and data-sharing constraints that come with building and running a system of this kind on that scale. Nardulli & co. don’t report exactly how much money has been spent on SPEED so far and how much it costs to keep running it, but they do say this:

The Cline Center began assembling its news archive and developing SPEED’s workflow system in 2006, but lacked an operational cyberinfrastructure until 2009. Seven years and well over a million dollars later, the Cline Center released its first SPEED data set.

Partly because of those high costs and partly because of legal issues attached to data coded from many news stories, the data SPEED produces are not freely available to the public. The project shares some historical data sets on its web site, but the content of those sets is limited, and the near-real-time data coveted by applied researchers like me are not made public. Here’s how the authors describe their situation:

While agreements with commercial vendors and intellectual property rights prohibit the Center from distributing its news archive, efforts are being made to provide non-consumptive public access to the Center’s holdings. This access will allow researchers to evaluate the utility of the Center’s digital archive for their needs and construct a research design to realize those needs. Based on that design, researchers can utilize the Center’s various subcenters of expertise (document classification, training, coding, etc.) to implement it.

I’m not happy about those constraints, but as someone who has managed large and costly social-science research projects, I certainly understand them. I also don’t expect them to go away any time soon, for SPEED or for any similar undertaking.

So that’s the state of the art in the production of political event data: Thanks to the growth of the Internet and advances in computing hardware and software, we can now produce political event data on a scale and at a pace that would have had us drooling a decade ago, but the task still can’t be fully automated without making sacrifices in data quality that most social scientists should be uncomfortable making. The best systems we can build right now blend machine learning and automation with routine human involvement and oversight. Those systems are still expensive to build and run, and partly because of that, we should not expect their output to stream onto our virtual desktops for free, like manna raining down from digital heaven.

Statistical Assessments of Coup Risk for 2015

Which countries around the world are more likely to see coup attempts in 2015?

For the fourth year in a row, I’ve used statistical models to generate one answer to that question, where a coup is defined more or less as a forceful seizure of national political authority by military or political insiders. (I say “more or less” because I’m blending data from two sources with slightly different definitions; see below for details.) A coup doesn’t need to succeed to count as an attempt, but it does need to involve public action; alleged plots and rumors of plots don’t qualify. Neither do insurgencies or foreign invasions, which by definition involve military or political outsiders. The heat map below shows variation in estimated coup risk for 2015, with countries colored by quintiles (fifths).

forecast.heatmap.2015

The dot plot below shows the estimates and their 90-percent confidence intervals (CIs) for the 40 countries with the highest estimated risk. The estimates are the unweighted average of forecasts from two logistic regression models; more on those in a sec. To get CIs for estimates from those two models, I took a cue from a forthcoming article by Lyon, Wintle, and Burgman (fourth publication listed here; the version I downloaded last year has apparently been taken down, and I can’t find another) and just averaged the CIs from the two models.

forecast.dotplot.2015

I’ve consistently used simple two- or three-model ensembles to generate these coup forecasts, usually pairing a logistic regression model with an implementation of Random Forests on the same or similar data. This year, I decided to use only a pair of logistic regression models representing somewhat different ideas about coup risk. Consistent with findings from other work in which I’ve been involved (here), k-fold cross-validation told me that Random Forests wasn’t really boosting forecast accuracy, and sticking to logistic regression makes it possible to get and average those CIs. The first model matches one I used last year, and it includes the following covariates:

  • Infant mortality rate. Deaths of children under age 1 per 1,000 live births, relative to the annual global median, logged. This measure primarily reflects national wealth but is also sensitive to variations in quality of life produced by things like corruption and inequality. (Source: U.S. Census Bureau)
  • Recent coup activity. A yes/no indicator of whether or not there have been any coup attempts in that country in the past five years. I’ve tried logged event counts and longer windows, but this simple version contains as much predictive signal as any other. (Sources: Center for Systemic Peace and Powell and Thyne)
  • Political regime type. Following Fearon and Laitin (here), a categorical measure differentiating between autocracies, anocracies, democracies, and other forms. (Source: Center for Systemic Peace, with hard-coded updates for 2014)
  • Regime durability. The “number of years since the last substantive change in authority characteristics (defined as a 3-point change in the POLITY score).” (Source: Center for Systemic Peace, with hard-coded updates for 2014)
  • Election year. A yes/no indicator for whether or not any national elections (executive, legislative, or general) are scheduled to take place during the forecast year. (Source: NELDA, with hard-coded updates for 2011–2015)
  • Economic growth. The previous year’s annual GDP growth rate. To dampen the effects of extreme values on the model estimates, I take the square root of the absolute value and then multiply that by -1 for cases where the raw value is less than 0. (Source: IMF)
  • Political salience of elite ethnicity. A yes/no indicator for whether or not the ethnic identity of national leaders is politically salient. (Source: PITF, with hard-coded updates for 2014)
  • Violent civil conflict. A yes/no indicator for whether or not any major armed civil or ethnic conflict is occurring in the country. (Source: Center for Systemic Peace, with hard-coded updates for 2014)
  • Country age. Years since country creation or independence, logged. (Source: me)
  • Coup-tagion. Two variables representing (logged) counts of coup attempts during the previous year in other countries around the world and in the same geographic region. (Source: me)
  • Post–Cold War period. A binary variable marking years after the disintegration of the USSR in 1991.
  • Colonial heritage. Three separate binary indicators identifying countries that were last colonized by Great Britain, France, or Spain. (Source: me)
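The signed square-root transformation applied to economic growth in that list is easy to express in R. Here is a minimal sketch; the function name is mine, not the one in my scripts:

```r
# Dampen extreme GDP growth values: take the square root of the
# absolute value, then restore the sign of the raw value
signed.sqrt <- function(x) sign(x) * sqrt(abs(x))

signed.sqrt(c(-9, -1, 0, 4, 25))  # -3 -1  0  2  5
```

Because sign(0) is 0, a zero growth rate passes through unchanged, and large spikes in either direction get pulled toward the middle without losing their direction.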

The second model takes advantage of new data from Geddes, Wright, and Frantz on autocratic regime types (here) to consider how qualitative differences in political authority structures and leadership might shape coup risk—both directly, and indirectly by mitigating or amplifying the effects of other things. Here’s the full list of covariates in this one:

  • Infant mortality rate. Deaths of children under age 1 per 1,000 live births, relative to the annual global median, logged. This measure primarily reflects national wealth but is also sensitive to variations in quality of life produced by things like corruption and inequality. (Source: U.S. Census Bureau)
  • Recent coup activity. A yes/no indicator of whether or not there have been any coup attempts in that country in the past five years. I’ve tried logged event counts and longer windows, but this simple version contains as much predictive signal as any other. (Sources: Center for Systemic Peace and Powell and Thyne)
  • Regime type. Using the binary indicators included in the aforementioned data from Geddes, Wright, and Frantz with hard-coded updates for the period 2011–2014, a series of variables differentiating between the following:
    • Democracies
    • Military autocracies
    • One-party autocracies
    • Personalist autocracies
    • Monarchies
  • Regime duration. Number of years since the last change in political regime type, logged. (Source: Geddes, Wright, and Frantz, with hard-coded updates for the period 2011–2014)
  • Regime type * regime duration. Interactions to condition the effect of regime duration on regime type.
  • Leader’s tenure. Number of years the current chief executive has held that office, logged. (Source: PITF, with hard-coded updates for 2014)
  • Regime type * leader’s tenure. Interactions to condition the effect of leader’s tenure on regime type.
  • Election year. A yes/no indicator for whether or not any national elections (executive, legislative, or general) are scheduled to take place during the forecast year. (Source: NELDA, with hard-coded updates for 2011–2015)
  • Regime type * election year. Interactions to condition the effect of election years on regime type.
  • Economic growth. The previous year’s annual GDP growth rate. To dampen the effects of extreme values on the model estimates, I take the square root of the absolute value and then multiply that by -1 for cases where the raw value is less than 0. (Source: IMF)
  • Regime type * economic growth. Interactions to condition the effect of economic growth on regime type.
  • Post–Cold War period. A binary variable marking years after the disintegration of the USSR in 1991.
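Mechanically, combining the two models is simple: fit each logistic regression, generate predicted probabilities, and take their unweighted average. Here is a minimal sketch with simulated data and stand-in covariates (the variable names are illustrative, not the ones in my data):

```r
set.seed(20150101)

# Simulated country-year training data with two stand-in covariates
n <- 500
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$coup <- rbinom(n, 1, plogis(-3 + d$x1))

# Two logistic regressions representing different ideas about coup risk
m1 <- glm(coup ~ x1, data = d, family = binomial)
m2 <- glm(coup ~ x2, data = d, family = binomial)

# Ensemble forecast: unweighted average of the two predicted probabilities
d$p.ens <- (predict(m1, type = "response") +
            predict(m2, type = "response")) / 2
```

The confidence intervals get the same treatment: compute an interval from each model on the probability scale, then average the lower bounds and the upper bounds.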

As I’ve done for the past couple of years, I used event lists from two sources—the Center for Systemic Peace (about halfway down the page here) and Jonathan Powell and Clayton Thyne (Dataset 3 here)—to generate the historical data on which those models were trained. Country-years are the unit of observation in this analysis, so a country-year is scored 1 if either CSP or P&T saw any coup attempts there during those 12 months and 0 otherwise. The plot below shows annual counts of successful and failed coup attempts in countries worldwide from 1946 through 2014 according to the two data sources. There is a fair amount of variance in the annual counts and the specific events that comprise them, but the basic trend over time is the same. The incidence of coup attempts rose in the 1950s; spiked in the early 1960s; remained relatively high throughout the rest of the Cold War; declined in the 1990s, after the Cold War ended; and has remained relatively low throughout the 2000s and 2010s.

Annual counts of coup events worldwide from two data sources, 1946-2014

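In code, the either-source outcome coding described above is a one-liner. Here is a sketch using toy data and hypothetical indicator columns for the two sources:

```r
# Toy country-year data with coup-attempt indicators from the two sources
dat <- data.frame(country  = c("A", "A", "B", "B"),
                  year     = c(2013, 2014, 2013, 2014),
                  csp.coup = c(0, 1, 0, 0),
                  pt.coup  = c(0, 1, 1, 0))

# Scored 1 if either CSP or P&T recorded any coup attempt that year
dat$any.coup <- as.integer(dat$csp.coup == 1 | dat$pt.coup == 1)
dat$any.coup  # 0 1 1 0
```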

I’ve been posting annual statistical assessments of coup risk on this blog since early 2012; see here, here, and here for the previous three iterations. I have rejiggered the modeling a bit each year, but the basic process (and the person designing and implementing it) has remained the same. So, how accurate have these forecasts been?

The table below reports areas under the ROC curve (AUC) and Brier scores (the 0–1 version) for the forecasts from each of those years and their averages, using the two coup event data sources alone and together as different versions of the observed ground truth. Focusing on the “either” columns, because that’s what I’m usually using when estimating the models, we can see that the average accuracy—AUC in the low 0.80s and a Brier score of about 0.03—is comparable to what we see in many other country-year forecasts of rare political events using a variety of modeling techniques (see here). With the AUC, we can also see a downward trend over time. With so few events involved, though, three years is too few to confidently deduce a trend, and those averages are consistent with what I typically see in k-fold cross-validation. So, at this point, I suspect those swings are just normal variation.

AUC and Brier scores for coup forecasts posted on Dart-Throwing Chimp, 2012-2014, by coup event data source

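For readers who want to replicate that kind of scoring, both metrics take only a few lines of base R. Here is a minimal sketch with made-up forecasts and outcomes:

```r
# Brier score (0-1 version): mean squared difference between forecast and outcome
brier <- function(p, y) mean((p - y)^2)

# AUC via the rank-sum identity: the probability that a randomly chosen
# event receives a higher forecast than a randomly chosen non-event
auc <- function(p, y) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

p <- c(0.9, 0.7, 0.3, 0.2, 0.05)  # forecasts
y <- c(1,   0,   1,   0,   0)     # observed outcomes
brier(p, y)  # 0.2065
auc(p, y)    # 0.8333...
```

Lower is better for the Brier score; higher is better for AUC, with 0.5 equivalent to random guessing.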

The separation plot designed by Greenhill, Ward, and Sacks (here) offers a nice way to visualize the accuracy of these forecasts. The ones below show the three annual slices using the “either” version of the outcome, and they reinforce the story told in the table: the forecasts have correctly identified most of the countries that saw coup attempts in the past three years as relatively high-risk cases, but the accuracy has declined over time. Let’s define a surprise as a case that fell outside the top 30 of the ordered forecasts but still saw a coup attempt. In 2012, just one of four countries that saw coup attempts was a surprise: Papua New Guinea, ranked 48. In 2013, that number increased to two of five (Eritrea at 51 and Egypt at 58), and in 2014 it rose to three of five (Burkina Faso at 42, Ukraine at 57, and the Gambia at 68). Again, though, the average accuracy across the three years is consistent with what I typically see in k-fold cross-validation of these kinds of models in the historical data, so I don’t think we should make too much of that apparent time trend just yet.

cou.scoring.sepplot.2012 cou.scoring.sepplot.2013 cou.scoring.sepplot.2014

This year, for the first time, I am also running an experiment in crowdsourcing coup risk assessments by way of a pairwise wiki survey (survey here, blog post explaining it here, and preliminary results discussed here). My long-term goal is to repeat this process numerous times on this topic and some others (for example, onsets of state-led mass killing episodes) to see how the accuracy of the two approaches compares and how their output might be combined. Statistical forecasts are usually much more accurate than human judgment, but that advantage may be reduced or eliminated when we aggregate judgments from large and diverse crowds, or when we don’t have data on important features to use in those statistical models. Models that use annual data also suffer in comparison to crowdsourcing processes that can update continuously, as that wiki survey does (albeit with a lot of inertia).

We can’t incorporate the output from that wiki survey into the statistical ensemble, because the survey doesn’t generate predicted probabilities; it only assesses relative risk. We can, however, compare the rank orderings the two methods produce. The plot below juxtaposes the rankings produced by the statistical models (left) with the ones from the wiki survey (right). About 500 votes have been cast since I wrote up the preliminary results, but I’m going to keep things simple for now and use the preliminary survey results I already wrote up. The colored arrows identify cases ranked at least 10 spots higher (red) or lower (blue) by the crowd than the statistical models. As the plot shows, there are many differences between the two, even toward the top of the rankings where the differences in statistical estimates are bigger and therefore more meaningful. For example, the crowd sees Nigeria, Libya, and Venezuela as top 10 risks while the statistical models do not; of those three, only Nigeria ranks in the top 30 on the statistical forecasts. Meanwhile, the crowd pushes Niger and Guinea-Bissau out of the top 10 down to the 20s, and it sees Madagascar, Afghanistan, Egypt, and Ivory Coast as much lower risks than the models do. Come 2016, it will be interesting to see which version was more accurate.

coup.forecast.comparison.2015

If you are interested in getting hold of the data or R scripts used to produce these forecasts and figures, please send me an email at ulfelder at gmail dot com.

A Crowd’s-Eye View of Coup Risk in 2015

A couple of weeks ago (here), I used the blog to launch an experiment in crowdsourcing assessments of coup risk for 2015 by way of a pairwise wiki survey. The survey is still open and will stay that way until the end of the year, but with nearly 2,700 pairwise votes already cast, I thought it was good time to take stock of the results so far.

Before discussing those results, though, let me say thank you to all the people who voted in the survey or shared the link. These data don’t materialize from thin air. They only exist because busy people contributed their knowledge and time, and I really appreciate all of those contributions.

Okay, so, what does that self-assembled crowd think about relative risks of coup attempts in 2015? The figure below maps the country scores produced from the votes cast so far. Darker grey indicates higher risk. PLEASE NOTE: Those scores fall on a 0–100 scale, but they are not estimated probabilities of a coup attempt. Instead, they are only measures of relative risk, because that’s all we can get from a pairwise wiki survey. Coup attempts are rare events—in most recent years, we’ve seen fewer than a handful of them worldwide—so the safe bet for nearly every country every year is that there won’t be any coup attempts this year.

wikisurvey.couprisk.2015.map

 

Smaller countries can be hard to find on that map, and small differences in scores can be hard to discern, so I also like to have a list of the results to peruse. Here’s a dot plot with countries in descending order by model score. (It’d be nice to make this table sortable so you could also look for countries alphabetically, but my Internet fu is not up to that task.)

wikisurvey.couprisk.2015.dotplot

This survey is open to the public, and participants may cast as many votes as they like in as many sessions as they like. The scores summarized above come from nearly 2,700 votes cast between the morning of January 3, when I published the blog post about the survey, and the morning of January 14, when I downloaded a report on the current results. At present, this blog has a few thousand followers on WordPress and a few hundred email subscribers. I also publicized the survey twice on Twitter, where I have approximately 6,000 followers: once when I published the initial blog post, and again on January 13. As the plot below shows, participation spiked around both of those pushes and was low otherwise.

votesovertime.20150114

The survey instrument does not collect identifying information about participants, so it is impossible to describe the make-up of the crowd. What we do know is that those votes came from about 100 unique user sessions. Some people probably participated more than once—I know that I cast a dozen or so votes on a few occasions—so 100 unique sessions probably works out to something like 80 or 90 individuals. But that’s a guess.

[Figure: usersessions.20150114, unique user sessions over the survey period]

We also know that those votes came from lots of different parts of the world. As the map below shows, most of the votes came from the U.S., Europe, and Australia, but there were also pockets of activity in the Middle East (especially Israel), Latin America (Brazil and Argentina), Africa (Cote d’Ivoire and Rwanda), and Asia (Thailand and Bangladesh).

[Figure: votemap.20150114, a world map of votes by country of origin]

I’ll talk a little more about the substance of these results when I publish my statistical assessments of coup risk for 2015, hopefully in the next week or so. Meanwhile, number-crunchers can get a .csv with the data used to generate the map and table in this post from my Google Drive (here) and the R script from GitHub (here). If you’re interested in seeing the raw vote-level data from which those scores were generated, drop me a line.

A Forecast of Global Democratization Trends Through 2025

A couple of months ago, I was asked to write up my thoughts on global trends in democratization over the next five to 10 years. I said at the time that, in coarse terms, I see three plausible alternative futures: 1) big net gains, 2) big net losses, and 3) little net change.

  • By big net gains, I mean a rise in the prevalence of democratic regimes above 65 percent or, because of its size and geopolitical importance, democratization in China absent a sharp decline in the global prevalence of democracy. For big net gains to happen, we would need to see a) one or more clusters of authoritarian breakdown and subsequent democratization in the regions where such clusters are still possible, i.e., Asia, the former Soviet Union, and the Middle East and North Africa (or the aforementioned transition in China); and b) no sharp losses in the regions where democracy is now prevalent, i.e., Europe, the Americas, and sub-Saharan Africa. I consider (a) unlikely but possible (see here) and (b) highly likely. The scenario requires both conditions, so it is unlikely.
  • By big net losses, I mean a drop in the global prevalence of democracy below 55 percent. For that to happen, we would need to see the opposite of big net gains—that is, a) no new clusters of democratization and no democratization in China and b) sharp net losses in one or more of the predominantly democratic regions. In my judgment, (a) is likely but (b) is very unlikely. This outcome depends on the conjunction of (a) and (b), so the low probability of (b) means this outcome is highly unlikely. A reversion to autocracy somewhere in Western Europe or North America would also push us into “big net loss” territory, but I consider that event extremely unlikely (see here and here for why).
  • In the absence of either of these larger shifts, we will probably see little net change in the pattern of the past decade or so: a regular trickle of transitions to and from democracy at rates that are largely offsetting, leaving the global prevalence hovering between 55 and 65 percent. Of course, we could also wind up with little net change in the global prevalence of democracy under a scenario in which some longstanding or otherwise significant authoritarian regimes—for example, China, Russia, Iran, or Saudi Arabia—break down, and those breakdowns spread to interdependent regimes, but most of those breakdowns lead to new authoritarian regimes or short-lived attempts at democracy. This is what we saw in the Arab Spring, and base rates from the past several decades suggest that it is the most likely outcome of any regional clusters of authoritarian breakdown in the next decade or so as well. I consider this version of the little-net-change outcome to be more likely than the other one (offsetting trickles of transitions to and from democracy with no new clusters of regime breakdown). Technically, we could also get to an outcome of little net change through a combination of big net losses in predominantly democratic regions and big gains in predominantly authoritarian regions, but I consider this scenario so unlikely in the next five to 10 years that it’s not worth considering in depth.

I believe the probabilities of big net gains and persistence of current levels are both much greater than the probability of big net losses. In other words, I am generally bullish. For the sake of clarity, I would quantify those guesses as follows:

  • Probability of big net gains: 20 percent
  • Probability of little net change: 75 percent
    • With regime breakdown in one or more critical autocracies: 60 percent
    • Without regime breakdown in any critical autocracies: 15 percent
  • Probability of big net losses: 5 percent

That outlook is informed by a few theoretical and empirical observations.

First, when I talk about democratization, I have in mind expansions of the breadth, depth, and protection of consultation between national political regimes and their citizens. As Charles Tilly argues on p. 24 of his 2007 book, Democracy, “A regime is democratic to the degree that political relations between the state and its citizens feature broad, equal, protected, and mutually binding consultation.” Fair and competitive elections are the most obvious and in some ways the most important form this consultation can take, but they are not the only one. Still, for purposes of observing broad trends and coarsely comparing cases, we can define a democracy as a regime in which officials who actually rule are chosen through fair and competitive elections in which nearly all adult citizens can vote. The fairness of elections depends on the existence of numerous civil liberties, including freedoms of speech, assembly, and association, and the presence of a reasonably free press, so this is not a low bar. Freedom House’s list of electoral democracies is a useful proxy for this set of conditions.

Second, we do not understand the causal processes driving democratization well, and we certainly don’t understand them well enough to know how to manipulate them in order to reliably produce desired outcomes. The global political economy, and the political economies of the states that comprise one layer of it, are parts of a complex adaptive system. This system is too complex for us to model and understand in ways that are more than superficial, partly because it continues to evolve as we try to understand and manipulate it. That said, we have seen some regularities in this system over the past half-century or so:

  • States are more likely to try and then to sustain democratic regimes as their economies grow, their economies become more complex, and their societies transform in ways associated with those trends (e.g., live longer, urbanize, and become more literate). These changes don’t produce transitions, but they do create structural conditions that are more conducive to them.
  • Oil-rich countries have been the exceptions to this pattern, but even they are not impervious (e.g., Mexico, Indonesia). Specifically, they are more susceptible to pressures to democratize when their oil income diminishes, and variation over time in that income depends, in part, on forces beyond their control (e.g., oil prices).
  • Consolidated single-party regimes are the most resilient form of authoritarian rule. Personalist dictatorships are also hard to topple as long as the leader survives but often crumble when that changes. Military-led regimes that don’t evolve into personalist or single-party autocracies rarely last more than a few years, especially since the end of the Cold War.
  • Most authoritarian breakdowns occur in the face of popular protests, and those protests are more likely to happen when the economy is slumping, when food or fuel prices are spiking, when protests are occurring in nearby or similar countries, and around elections. Signs that elites are fighting amongst themselves may also help to spur protests, but elite splits are common in autocracies and often emerge in reaction to protests, not ahead of them.
  • Most attempts at democracy end with a reversion to authoritarian rule, but the chances that countries will try again and then that democracy will stick improve as countries get richer and have tried more times before. The origins of the latter pattern are unclear, but they probably have something to do with the creation of new forms of social and political organization and the subsequent selection and adaptation of those organizations into “fitter” competitors under harsh pressures.

Third, whatever its causes, there is a strong empirical trend toward democratization around the world. Since the middle of the twentieth century, both the share of regimes worldwide that are democratic and the share of the global population living in democratic regimes have expanded dramatically. These expansions have not come steadily, and there is always some churn in the system, but the broader trend persists in spite of those dips and that churn.

The strength and, so far, persistence of this trend lead me to believe that the global system would have to experience a profound collapse or transformation for that trend to be disrupted. Under the conditions that have prevailed for the past century or so, selection pressures in the global system seem to be running strongly in favor of democratic political regimes with market-based economies.

Crucially, this long-term trend has also proved resilient to the global financial crisis that began in 2007-2008 and has persisted to some degree ever since. This crisis was as sharp a stress test of many national political regimes as we have seen in a while, perhaps since World War II. Democracy has survived this test in all of the world’s wealthy countries, and there was no stampede away from democracy in less wealthy countries with younger regimes. Freedom House and many other activists lament the occurrence of a “democratic recession” over the past several years, but global data just don’t support the claim that one is occurring. What we have seen instead is a slight decline in the prevalence of democratic regimes accompanied by a deepening of authoritarian rule in many of the autocracies that survived the last flurry of democratic transitions.

Meanwhile, some authoritarian regimes in the Middle East and North Africa broke down in the face of uprisings demanding greater popular accountability, and some of those breakdowns led to attempts at democratization—in Tunisia, Egypt, and Libya in particular. Most of those attempts at democratization have since failed, but not all did, Tunisia being the notable exception. What’s more, the popular pressure in favor of democratization has not dissipated in all of the cases where authoritarian breakdown didn’t happen. Bahrain, Kuwait, and, to a lesser extent, Saudi Arabia are notable in this regard.

Rising pressures on China and Russia suggest that similar clusters of regime instability are increasingly likely in their respective neighborhoods, even if they remain unlikely in any given year. China faces significant challenges on numerous fronts, including a slowing economy, a looming real-estate debt crisis, swelling popular frustration over industrial pollution, an uptick in labor activism, an anti-corruption campaign that could alienate some political and military insiders, and a separatist insurgency in Xinjiang. No one of those challenges is necessarily likely to topple the regime, but the presence of so many of them at once adds up to a significant risk (or opportunity, depending on one’s perspective). A regime crisis in China could ripple through its region with strongest effect on the most dependent regimes—on North Korea in particular, but also perhaps Vietnam, Laos, and Myanmar. Even if a crisis there didn’t reverberate, China’s population size and rising international influence imply that any movement toward democracy would have a significant impact on the global balance sheet.

The Russian regime is also under increased pressure, albeit for different reasons. Russia is already in recession, and falling oil prices and capital flight are making things much worse without much promise of near-term relief. U.S. and E.U. sanctions deserve significant credit (or blame) for the acceleration of capital flight, and prosecution of the war in Ukraine is also imposing additional direct costs on Russia’s power resources. The extant regime has survived domestic threats before, but 10 more years is a long time for a regime that stands on feet of socioeconomic clay.

Above all else, these last two points—about 1) the resilience of existing democracies to the stress of the past several years and 2) the persistence and even deepening of pressures on many surviving authoritarian regimes—are what make me bullish about the prospects for democracy in the next five to 10 years. In light of current trends in China and Russia, I have a hard time imagining both of those regimes surviving to 2025. Democratization might not follow, and if it does, it won’t necessarily stick, at least not right away. Neither regime can really get a whole lot more authoritarian than it is now, however, so the possibilities for change on this dimension are nearly all on the upside. (The emergence of a new authoritarian regime that is more aggressive abroad is also possible in both cases, but that topic is beyond the scope of this memo.)

Talk about the possibility of a wave of democratic reversals usually centers on the role China or Russia might play as either an agent of de-democratization or example of an alternative future. As noted above, though, both of these systems are currently facing substantial stresses at home. These stresses both limit their ability to act as agents of de-democratization and take the shine off any example they might set.

In short, I think that talk of Russia and China’s negative influence on the global democratization trend is overblown. Apart from the (highly unlikely) invasion and successful occupation of other countries, I don’t think either of these governments has the ability to undo democratization elsewhere. Both can and do help some other authoritarian regimes survive, however, and this is why regime crisis or breakdown in either one of them has the potential to catalyze new clusters of regime instability in their respective neighborhoods.

What do you think? If you made it this far and have any (polite) reactions you’d like to share, please leave a comment.

Schrodinger’s Coup

You’ve heard of Schrödinger’s cat, right? This is the famous thought experiment proposed by Nobel Prize–winning physicist Erwin Schrödinger to underscore what he saw as the absurdity of quantum superposition—the idea that “an object in a physical system can simultaneously exist in all possible configurations, but observing the system forces the system to collapse and forces the object into just one of those possible states.”

Schrödinger designed his thought experiment to refute the idea that a physical object could simultaneously occupy multiple physical states. At the level of whole cats, anyway, I’m convinced.

When it comes to coups, though, I’m not so sure. Arguments over whether or not certain events were or were not coups or coup attempts usually involve reasonable disagreements over definitions, but fundamental uncertainty about the actions and intentions involved often plays a role, too—especially in failed attempts. Certain sets of events exist in a perpetual state of ambiguity, simultaneously coup and not-coup with no possibility of our ever observing the definitive empirical facts that would force the cases to collapse into a single, clear condition.

Two recent examples help show what I mean. The first is last week’s coup/not coup attempt in the Gambia. From initial reports, it seemed pretty clear that some disgruntled soldiers had tried and failed to seize power while the president was traveling. That’s a classic coup scenario, and all the elements present in most coup definitions were there: military or political insiders seeking to overthrow the government through the use or threat of force.

This week, though, we hear that the gunmen in question were diasporans who hatched the plot abroad without any help on the inside. As the New York Times reported,

According to the [Justice Department’s] complaint, filed in federal court in Minnesota, the plot to topple Mr. Jammeh was hatched in October. Roughly a dozen Gambians in the United States, Germany, Britain and Senegal were involved in the plot, the complaint said. The plotters apparently thought, mistakenly, that members of the Gambian armed forces would join their cause…

The plot went awry when State House guards overwhelmed the attackers with heavy fire, leaving many dead or wounded. Mr. Faal and Mr. Njie escaped and returned to the United States, where they were arrested, the complaint said.

So, were the putschists really just a cabal of outsiders, in which case this event would not qualify as a coup attempt under most academic definitions? Or did they have collaborators on the inside who were discovered or finked out at the crucial moment, making a coup attempt look like a botched raid? The Justice Department’s complaint implies the former, but we’ll never know for sure.

Lesotho offers a second recent example of coup-tum superposition. In late August, that country’s prime minister, Thomas Thabane, fled to neighboring South Africa and cried “Coup!” after soldiers shut down radio stations and surrounded his residence and police headquarters in the capital. But, as Kristen van Schie reported for Al Jazeera,

Not so, said the military. It claimed the police were planning on arming UTTA—the government-aligned youth movement accused of planning to disrupt Monday’s march [against Thabane]. It was not so much a coup as a preventative anti-terrorism operation, it said.

The prime minister and the South African government continued to describe the event as a coup attempt despite that denial, but other observers disagreed. As analyst Dimpho Motsamai told van Schie, “Can one call it a coup when the military haven’t declared they’ve taken over government?” Maybe this really was just a misunderstanding accelerated by the country’s persistent factional crisis.

This uncertainty is generic to the study of political behavior, where the determination of a case’s status depends, in part, on the actors’ intentions, which can never be firmly established. Did certain members of the military in the Gambia mean to cooperate with the diasporans who shot their way toward the State House last week, only to quit or get stopped before the decisive moment arrived? Was the commander of Lesotho’s armed forces planning to oust the prime minister when he ordered soldiers out of the barracks in late August, only to change his mind and tune after Thabane escaped the country?

To determine with certainty whether or not these events were coup attempts, we need clear answers to those questions, but we can’t get them. Instead, we can only see the related actions, and even those are incompletely and unreliably reported in most cases. Sometimes we get post hoc descriptions and explanations of those actions from the participants or close observers, but humans are notoriously unreliable reporters of their own intentions, especially in high-visibility, high-stakes situations like these.

Because this problem is fundamental to the study of political behavior, the best we can do is acknowledge it and adjust our estimates and inferences accordingly. When assembling data on coup attempts for comparative analysis, instead of just picking one source, we might use Bayesian measurement models to try to quantify this collective uncertainty (see here for a related example). Then, before reporting new findings on the causes or correlates of coup attempts, we might ask: which cases are more ambiguous than the others, and how would their removal from or addition to the sample alter our conclusions?
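To make that robustness check concrete, here is a minimal Python sketch with made-up numbers. It compares a simple coup correlate (the gap in mean GDP growth between coup and non-coup cases) estimated on the full sample against the same quantity estimated after dropping the cases flagged as ambiguous; a large difference between the two would tell you the finding leans on contested codings.

```python
# Every country, coding, and growth figure here is invented for illustration.
cases = [
    {"country": "A", "coup": 1, "gdp_growth": -2.0, "ambiguous": False},
    {"country": "B", "coup": 1, "gdp_growth": -0.5, "ambiguous": True},
    {"country": "C", "coup": 0, "gdp_growth": 3.0,  "ambiguous": False},
    {"country": "D", "coup": 0, "gdp_growth": 1.5,  "ambiguous": False},
    {"country": "E", "coup": 1, "gdp_growth": -1.0, "ambiguous": True},
]

def mean_growth_gap(sample):
    """Difference in mean GDP growth between coup and non-coup cases."""
    coup = [c["gdp_growth"] for c in sample if c["coup"] == 1]
    no_coup = [c["gdp_growth"] for c in sample if c["coup"] == 0]
    return sum(coup) / len(coup) - sum(no_coup) / len(no_coup)

full = mean_growth_gap(cases)
strict = mean_growth_gap([c for c in cases if not c["ambiguous"]])
# If full and strict diverge sharply, the result rests on the contested cases.
```

The same two-pass logic applies unchanged to a real regression; only the estimator swapped in for `mean_growth_gap` would differ.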

A Few Rules of Thumb for Data Munging in Political Science

1. However hard you think it will be to assemble a data set for a particular analysis, it will be exponentially harder, with the size of the exponent determined by the scope and scale of the required data.

  • Corollary: If the data you need would cover the world (or just poor countries), they probably don’t exist.
  • Corollary: If the data you need would extend very far back in time, they probably don’t exist.
  • Corollary: If the data you need are politically sensitive, they probably don’t exist. If they do exist, you probably can’t get them. If you can get them, you probably shouldn’t trust them.

2. However reliable you think your data are, they probably aren’t.

  • Corollary: A couple of digits after the decimal point is plenty. With data this noisy, what do those thousandths really mean, anyway?

3. Just because a data transformation works doesn’t mean it’s doing what you meant it to do.

4. The only really reliable way to make sure that your analysis is replicable is to have someone previously unfamiliar with the work try to replicate it. Unfortunately, a person’s incentive to replicate someone else’s work is inversely correlated with his or her level of prior involvement in the project. Ergo, this will rarely happen until after you have posted your results.

5. If your replication materials will include random parts (e.g., sampling) and you’re using R, don’t forget to set the seed for random number generation at the start. (Alas, I am living this mistake today.)
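In R that fix is a single set.seed() call at the top of the script. The same principle in Python, for anyone replicating in that language (the seed value is arbitrary):

```python
import random

random.seed(20150114)  # fix the RNG so any sampling is reproducible
sample_a = random.sample(range(1000), 5)

random.seed(20150114)  # re-seeding with the same value reproduces the draw
sample_b = random.sample(range(1000), 5)
# sample_a and sample_b are identical; without the seed calls, they
# would almost certainly differ from run to run.
```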

Please use the Comments to suggest additions, corrections, or modifications.

An Experiment in Crowdsourced Coup Forecasting

Later this month, I hope to have the data I need to generate and post new statistical assessments of coup risk for 2015. Meanwhile, I thought it would be interesting and useful to experiment with applying a crowdsourcing tool to this task. So, if you think you know something about coup risk and want to help with this experiment, please cast as many votes as you like here:

2015 Coup Risk Wiki Survey

For this exercise, let’s use Naunihal Singh’s (2014, p. 51) definition of a coup attempt: “An explicit action, involving some portion of the state military, police, or security forces, undertaken with intent to overthrow the government.” As Naunihal notes,

This definition retains most of the aspects commonly found in definitions of coup attempts [Ed.: including the ones I use in my statistical modeling] while excluding a wide range of similar activities, such as conspiracies, mercenary attacks, popular protests, revolutions, civil wars, actions by lone assassins, and mutinies whose goals explicitly excluded taking power (e.g., over unpaid wages). Unlike a civil war, there is no minimum casualty threshold necessary for an event to be considered a coup, and many coups take place bloodlessly.

By this definition, last week’s putsch in the Gambia and November’s power grab by a lieutenant colonel in Burkina Faso would qualify, but last February’s change of government by parliamentary action in Ukraine after President Yanukovich’s flight in the face of popular unrest would not. Nor would state collapses in Libya and Central African Republic, which occurred under pressure from rebels rather than state security forces. And, of course, Gen. Sisi’s seizure of power in Egypt in July 2013 clearly would qualify as a successful coup on these terms.

In a guest post here yesterday, Maggie Dwyer identified one factor—divisions and tensions within the military—that probably increases coup risk in some cases, but that we can’t fold into global statistical modeling because, as often happens, we don’t have the time-series cross-sectional data we would need to do that. Surely there are other such factors and forces. My hope is that this crowdsourcing approach will help spotlight some cases overlooked by the statistical forecasts because their fragility is being driven by things those models can’t consider.

Wiki surveys weren’t designed specifically for forecasting, but I have adapted them to this purpose on two other topics, and in both cases the results have been pretty good. As part of my work for the Early Warning Project, we have run wiki surveys on risks of state-led mass killing onset for 2014 and now 2015. That project’s data-makers didn’t see any such onsets in 2014, but the two countries that came closest—Iraq and Myanmar—ranked fifth and twelfth, respectively, in the wiki survey we ran in December 2013. On pro football, I’ve run surveys ahead of the 2013 and 2014 seasons. The results haven’t been clairvoyant, but they haven’t been shabby, either (see here and here for details).

I will summarize the results of this survey on coup risk in a blog post in mid-January and will make the country- and vote-level data freely available to other researchers when I do.

I don’t necessarily plan to close the survey at that point, though. In fact, I’m really hoping to get a chance to tinker with using it more dynamically. Ideally, we would leave the survey running throughout the year so that participants could factor new information—credible rumors of an impending coup, for example, or a successful post-election transfer of power without military intervention—into their voting decisions, and the survey results would update quickly in response to those more recent votes.

Doing that would require modifying the modeling process that converts the pairwise votes into scores, however, and I’m not sure that I’m up to the task. As developed, the wiki survey effectively weights all votes the same, regardless of when they were cast. To make the survey more sensitive to fresher information, we would need to tweak that process so that recent votes are weighted more heavily—maybe with a time-decaying weighting function, or just a sliding window that closes on older votes after some point. If we wanted to get really fancy, we might find a way to use the statistical forecasts as priors in this process, too, letting the time-sensitive survey results pull cases up or push them down as the year passes.
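Neither weighting scheme is hard to write down on its own; the hard part is wiring it into the vote-to-score model. As a standalone Python sketch (the half-life and window sizes are arbitrary illustrations, not recommendations):

```python
import math

def decayed_weights(vote_ages_days, half_life_days=30.0):
    """Exponential-decay weights: a vote cast half_life_days ago
    counts half as much as one cast today."""
    rate = math.log(2) / half_life_days
    return [math.exp(-rate * age) for age in vote_ages_days]

def sliding_window_weights(vote_ages_days, window_days=90):
    """The cruder alternative: full weight inside the window, zero outside."""
    return [1.0 if age <= window_days else 0.0 for age in vote_ages_days]

ages = [0, 30, 60, 120]                   # days since each vote was cast
w_decay = decayed_weights(ages)           # roughly [1, 0.5, 0.25, 0.0625]
w_window = sliding_window_weights(ages)   # [1.0, 1.0, 1.0, 0.0]
```

Under the decay version, old votes fade gradually; under the window version, they simply expire. Either set of weights would then have to be threaded through the pairwise-comparison model itself, which is the part I can imagine but not code.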

I can imagine these modifications, but I don’t think I can code them. If you’re reading this and you might like to collaborate on that work (or fund it!) or just have thoughts on how to do it, please drop me a line at ulfelder at gmail dot com.

A Failed Coup Attempt (and Forecast) in the Gambia

This is a guest post by Maggie Dwyer (@MagDwyer). She teaches courses related to security and politics in Africa at the University of St Andrews and the University of Edinburgh.

Things are often not the way they seem in the Gambia, and this may help explain why this week’s coup attempt in Banjul was not anticipated by Jay’s statistical forecasting.

Nicknamed “the smiling coast,” the Gambia has long been known for its beach resorts, which are particularly popular with European tourists. The country has never experienced a civil war and is generally considered peaceful. Its president, Yahya Jammeh, came to power in a coup in 1994 and has won four elections since.

A deeper look at the Gambia, however, shows that the appearance of stability comes at a high cost for the population. The repressive style of the Jammeh government has led to a growing list of human rights abuses. Critics of the regime are often met with harassment, arrest, detention, and disappearance. The country is shrouded in secrecy due to a lack of press freedoms. With no presidential term limits and no significant opposition, many see no end in sight for Jammeh’s regime.

The military is often viewed as the strong arm of the Jammeh regime and is responsible for many of the abuses. Yet, the military is also kept on edge. An endless series of promotions, demotions, firings, and re-hirings leave military personnel in a constant state of uncertainty. The sense of fear within the military is exacerbated by severe punishment for those deemed disloyal. Several military members were officially executed for alleged involvement in coup plots in 2013, and there are many more suspected executions, disappearances, and torture of military personnel on which the government has never commented.

There have also been reports of growing discontent within the military over preference for Jammeh’s minority ethnic group, the Jola. There are claims that the Jola have been given a disproportionate number of promotions, top positions, and opportunities (e.g., training and participation in peacekeeping), and that this favoritism has created divisions and spurred resentment in the military.

Despite the fate of past coup plotters in the Gambia, military personnel have continued to try to oust Jammeh. He has endured at least eight alleged coup attempts during his 20 years in office. Many of the accused plotters had served in the highest military positions, including Army Chief of Staff and Director of the National Intelligence Agency, suggesting divisions at the most senior levels. It should be noted that there is speculation as to whether some of the attempts were real or simply ways to purge members of the military. The ambiguity of these events is another cause of uncertainty and fear within the military.

These tensions, divisions, and dissatisfaction within the Gambian military probably contributed to the most recent and past coup attempts against Jammeh. Unfortunately, internal military tensions are difficult to observe and quantify, especially in repressive states like the Gambia. Because these tensions are hard to quantify, they rarely factor into larger statistical forecasts, even though we have good reason to believe they contribute significantly to coup risks.

The recent coup attempt in the Gambia will switch the “domestic coup activity” variable used in Jay’s models from ‘no’ to ‘yes’ and will thereby increase its ranking in the next iteration of those assessments. The climate of fear in the Gambia will also intensify following the crackdown from this week’s coup attempt, however, and may deter copycats in the near future.
