Refugee Flows and Disorder in the Global System

This:

The number of people displaced by violent conflict hit the highest level since World War II at the end of 2013, the head of the United Nations refugee agency, António Guterres, said in a report released on Friday…

Moreover, the impact of conflicts raging this year in Central African Republic, South Sudan, Ukraine and now Iraq threatens to push levels of displacement even higher by the end of 2014, he said.

…is, I think, another manifestation of the trends I discussed in a blog post here last September:

If we think on a systemic scale, it’s easier to see that we are now living through a period of global disorder matched in recent history only by the years surrounding the disintegration of the Soviet Union, and possibly exceeding it. Importantly, it’s not just the spate of state collapses through which this disorder becomes evident, but also the wider wave of protest activity and institutional transformation to which some of those collapses are connected.

If that’s true, then Mr. Guterres is probably right when he predicts that this will get even worse this year, because things still seem to be trending toward disorder. A lot of the transnational activity in response to local manifestations is still deliberately inflammatory (e.g., materiel and cash to rebels in Syria and Iraq, Russian support for separatists in Ukraine), and international efforts to quell some of those manifestations (e.g., UN PKOs in CAR and South Sudan) are struggling. Meanwhile, in what’s probably both a cause and an effect of these processes, global economic growth still has not rebounded as far or as fast as many had expected a year or two ago and remains uncertain and uneven.

In other words, the positive feedback still seems to be outrunning the negative feedback. Until that turns, the systemic processes driving (and being driven by) increased refugee flows will likely continue.

Addendum: The quote at the start of this post contains what I think is an error. A lot of the news stories on this report’s release used phrases like “displaced persons highest since World War II,” so I assumed that the U.N. report included the data on which that statement would be based. It turns out, though, that the report only makes a vague (and arguably misleading) reference to “the post-World War II era.” In fact, the U.N. does not have data to make comparisons on numbers of displaced persons prior to 1989. With the data it does have, the most the UNHCR can say is this, from p. 5: “The 2013 levels of forcible displacement were the highest since at least 1989, the first year that comprehensive statistics on global forced displacement existed.”

The picture also looks a little different from the press release if we adjust for increases in global population. Doing some rough math with the number of displaced persons in this UNHCR chart as the numerator and the U.S. Census Bureau’s mid-year estimates of world population as the denominator, here are some annual statistics on displaced persons as a share of the global population:

1989: 0.65%
1992: 0.84%
2010: 0.63%
2014: 0.72%
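For concreteness, here's that rough math in R. The 51.2 million figure is UNHCR's widely reported global total for the end of 2013, and the 7.1 billion world population is an approximation, so treat both inputs as assumptions:

    displaced  <- 51.2e6   # UNHCR's reported total of displaced persons, end of 2013
    population <- 7.1e9    # approximate world population for the same period
    round(100 * displaced / population, 2)
    # [1] 0.72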

In no way do I mean to make light of what’s obviously a massive global problem, but as a share of the global population, the latest numbers are not (yet) even the worst since 1989, the first year for which UNHCR has comparable data.

A Brief Response to Anne-Marie Slaughter on Iraq and Syria

Anne-Marie Slaughter has an op-ed in today’s New York Times in which she argues that the U.S. government should launch air strikes now against targets in Iraq and Syria as a way to advance America’s and the world’s strategic and humanitarian interests. Here is the crux of the piece:

President Obama should be asking the same question in Iraq and Syria. What course of action will be best, in the short and the long term, for the Iraqi and Syrian people? What course of action will be most likely to stop the violence and misery they experience on a daily basis? What course of action will give them the best chance of peace, prosperity and a decent government?

The answer to those questions may well involve the use of force on a limited but immediate basis, in both countries. Enough force to remind all parties that we can, from the air, see and retaliate against not only Al Qaeda members, whom our drones track for months, but also any individuals guilty of mass atrocities and crimes against humanity. Enough force to compel governments and rebels alike to the negotiating table. And enough force to create a breathing space in which decent leaders can begin to consolidate power.

For the moment, let’s take for granted her assertions about the strategic interests at stake; the U.S.’s responsibility to protect civilians in other countries, by force if necessary; and the propriety of taking such action without prior approval from the U.N. Security Council.

Conceding all of that ground, it’s easier to see that, as a practical matter, Slaughter’s recommendation depends on strong assumptions about the efficacy of the action she proposes. Specifically, she asserts that the U.S. should conduct air strikes (“use of force on a limited but immediate basis,” “from the air”) against targets in Iraq and Syria because doing so will have three main effects:

  1. Deter atrocities (“to remind all parties that we can…see and retaliate against…any individuals guilty of mass atrocities and crimes against humanity”);
  2. Spur talks among warring parties (“to compel governments and rebels alike to the negotiating table”); and
  3. Enable positive political development (“to create a breathing space in which decent leaders can begin to consolidate power”).

If you believe, as Slaughter apparently does, that limited air strikes a) will almost certainly achieve all of these goals and b) will not produce other harmful strategic or humanitarian consequences that could partially offset or even outweigh those gains, then you should probably endorse this policy.

If, however, you are unsure about the ability of limited air strikes on yet-to-be-named targets in Iraq and Syria to accomplish these ends, or about the unintended strategic and humanitarian consequences those strikes could also have, then you should hesitate to support this policy and think through those other possible futures.

Beware the Confident Counterfactual

Did you anticipate the Syrian uprising that began in 2011? What about the Tunisian, Egyptian, and Libyan uprisings that preceded and arguably shaped it? Did you anticipate that Assad would survive the first three years of civil war there, or that Iraq’s civil war would wax again as intensely as it has in the past few days?

All of these events or outcomes were difficult forecasting problems before they occurred, and many observers have been frank about their own surprise at many of them. At the same time, many of those same observers speak with confidence about the causes of those events. The invasion of Iraq in 2003 surely is or is not the cause of the now-raging civil war in that country. The absence of direct US or NATO military intervention in Syria is or is not to blame for continuation of that country’s civil war and the mass atrocities it has brought—and, by extension, the resurgence of civil war in Iraq.

But here’s the thing: strong causal claims require some confidence about how history would have unfolded in the absence of the cause of interest, and those counterfactual histories are no easier to get right than observed history was to anticipate.

Like all of the most interesting questions, what causality means and how we might demonstrate it will forever be matters for debate—see here on Daniel Little’s blog for an overview of that debate’s recent state—but most conceptions revolve around some idea of necessity. When we say X caused Y, we usually mean that had X not occurred, Y wouldn’t have happened, either. Subtler or less stringent versions might center on salience instead of necessity and insert a “probably” into the final phrase of the previous sentence, but the core idea is the same.

In nonexperimental social science, this logic implicitly obliges us to consider the various ways history might have unfolded in response to X’ rather than X. In a sense, then, both prediction and explanation are forecasting problems. They require us to imagine states of the world we have not seen and to connect them in plausible ways to ones we have. If anything, the counterfactual predictions required for explanation are more frustrating epistemological problems than the true forecasts, because we will never get to see the outcome(s) against which we could assess the accuracy of our guesses.

As Robert Jervis pointed out in his contribution to a 1996 edited volume on counterfactual thought experiments in world politics, counterfactuals are (or should be) especially hard to construct—and thus causal claims especially hard to make—when the causal processes of interest involve systems. For Jervis,

A system exists when elements or units are interconnected so that the system has emergent properties—i.e., its characteristics and behavior cannot be inferred from the characteristics and behavior of the units taken individually—and when changes in one unit or the relationship between any two of them produce ramifying alterations in other units or relationships.

As Jervis notes,

A great deal of thinking about causation…is based on comparing two situations that are the same in all ways except one. Any differences in the outcome, whether actual or expected…can be attributed to the difference in the state of the one element…

Under many circumstances, this method is powerful and appropriate. But it runs into serious problems when we are dealing with systems because other things simply cannot be held constant: as Garrett Hardin nicely puts it, in a system, ‘we can never do merely one thing.’

Jervis sketches a few thought experiments to drive this point home. He has a nice one about the effects of external interventions on civil wars that is topical here, but I think his New York traffic example is more resonant:

In everyday thought experiments we ask what would have happened if one element in our world had been different. Living in New York, I often hear people speculate that traffic would be unbearable (as opposed to merely terrible) had Robert Moses not built his highways, bridges, and tunnels. But to try to estimate what things would have been like, we cannot merely subtract these structures from today’s Manhattan landscape. The traffic patterns, the location of businesses and residences, and the number of private automobiles that are now on the streets are in significant measure the product of Moses’s road network. Had it not been built, or had it been built differently, many other things would have been different. Traffic might now be worse, but it is also possible that it would have been better because a more efficient public transportation system would have been developed or because the city would not have grown so large and prosperous without the highways.

Substitute “invade Iraq” or “fail to invade Syria” for Moses’s bridges and tunnels, and I hope you see what I mean.

In the end, it’s much harder to get beyond banal observations about influences to strong claims about causality than our story-telling minds and the popular media that cater to them would like. Of course the invasion of Iraq in 2003 or the absence of Western military intervention in Syria has shaped the histories that followed. But what would have happened in their absence—and, by implication, what would happen now if, for example, the US re-inserted its armed forces into Iraq or attempted to topple Assad? Those questions are far tougher to answer, and we should beware of anyone who speaks with great confidence about their answers. If you’re a social scientist who isn’t comfortable making and confident in the accuracy of your predictions, you shouldn’t be comfortable making and confident in the validity of your causal claims, either.

There Is No Such Thing as Civil War

In a 2008 conference paper, Jim Fearon and David Laitin used statistics and case narratives to examine how civil wars around the world since 1955 have ended. They found that deadly fights between central governments and domestic challengers usually only end after an abrupt change in the relative fighting power of one side or the other, and that these abrupt changes are usually brought on by the beginning or end of foreign support. This pattern led them to ruminate thus (emphasis in original):

We were struck by the high frequency of militarily significant foreign support for government and rebels. The evidence suggests that more often than not, civil wars either become – or may even begin as – the object of other states’ foreign policies…Civil wars are normally studied as matters of domestic politics. Future research might make progress by shifting the perspective, and thinking about civil war as international politics by other means.

Their study recently came to mind when I was watching various people on Twitter object to the idea that what’s happening in Ukraine right now could be described as civil war, or at least the possible beginnings of one. Even if some of the separatists mobilizing in eastern Ukraine really were Ukrainian nationals, they argued, the agent provocateur was Russia, so this fight is properly understood as a foreign incursion.

As Jim and David’s paper shows, though, strong foreign hands are a common and often decisive feature of the fights we call civil wars.

In Syria, for example, numerous foreign governments and other external agents are funding, training, equipping, and arming various factions in the armed conflict that’s raged for nearly three years now. Some of that support is overt, but the support we see when we read about the war in the press is surely just a fraction of what’s actually happening. Yet we continue to see the conflict described as a civil war.

In the Central African Republic, it’s Chad that’s played “an ambiguous and powerful role” in the conflict that has precipitated state collapse and ethnic cleansing there. As the New York Times described in April,

[Chad] was accused of supporting the overthrow of the nation’s president, and then later helped remove the rebel who ousted him, making way for a new transitional government. In a statement on Thursday, the Chadian government said that its 850 soldiers had been accused of siding with Muslim militias in sectarian clashes with Christian fighters that have swept the Central African Republic for months.

At least a couple of bordering states are apparently involved in the civil war that’s stricken South Sudan since December. In a May 2014 report, the UN Mission to South Sudan asserted that government forces were receiving support from “armed groups from the Republic of Sudan,” and that “the Government has received support from the Uganda People’s Defence Force (UPDF), notably in Juba and Jonglei State.” The report also claimed that “some Darfuri militias have allied with opposition forces in the northern part of Unity State,” which borders Sudan. And, of course, there is a nearly 8,000-strong UN peacekeeping operation that is arguably shaping the scale of the violence there, even if it isn’t stopping it.

Pick a civil war—any civil war—and you’ll find similar evidence of external involvement. This is what led Jim and David to muse about civil wars as “international politics by other means,” and what led me to the deliberately provocative title of this post. As a researcher, I see analytic value in sometimes distinguishing between interstate and intrastate wars, which may have distinct causes and follow different patterns and may therefore be amenable to different forms of prevention or mitigation. At the same time, I think it’s clear that this distinction is nowhere near as crisp in reality as our labels imply, so we should be mindful to avoid confusing the typology with the reality it crudely describes.

A Useful Data Set on Political Violence that Almost No One Is Using

For the past 10 years, the CIA has overtly funded the production of a publicly available data set on certain atrocities around the world that now covers the period from January 1995 to early 2014 and is still updated on a regular basis. If you work in a relevant field but didn’t know that, you’re not alone.

The data set in question is the Political Instability Task Force’s Worldwide Atrocities Dataset, which records information from several international press sources about situations in which five or more civilians are deliberately killed in the context of some wider political conflict. Each record includes information about who did what to whom, where, and when, along with a brief text description of the event, a citation for the source article(s), and, where relevant, comments from the coder. The data are updated monthly, although those updates are posted on a four-month lag (e.g., data from January become available in May).

The decision to limit collection to events involving at least five fatalities was a pragmatic one. As the data set’s codebook notes,

We attempted at one point to lower this threshold to one and the data collection demands proved completely overwhelming, as this involved assessing every murder and ambiguous accidental death reported anywhere in the world in the international media. “Five” has no underlying theoretical justification; it merely provides a threshold above which we can confidently code all of the reported events given our available resources.

For the past three years, the data set has also fudged this rule to include targeted killings that appear to have a political motive, even when only a single victim is killed. So, for example, killings of lawyers, teachers, religious leaders, election workers, and medical personnel are nearly always recorded, and these events are distinguished from ones involving five or more victims by a “Yes” in a field identifying “Targeted Assassinations” under a “Related Tactics” header.

The data set is compiled from stories appearing in a handful of international press sources that are accessed through Factiva. It is a computer-assisted process. A Boolean keyword search is used to locate potentially relevant articles, and then human coders read those stories and make data from the ones that turn out actually to be relevant. From the beginning, the PITF data set has pulled from Reuters, Agence France Press, Associated Press, and the New York Times. Early in the process, BBC World Monitor and CNN were added to the roster, and All Africa was also added a few years ago to improve coverage of that region.

The decision to restrict collection to a relatively small number of sources was also a pragmatic one. Unlike GDELT, for example—the routine production of which is fully automated—the Atrocities Data Set is hand-coded by people reading news stories identified through a keyword search. With people doing the coding, the cost of broadening the search to local and web-based sources is prohibitive. The hope is eventually to automate the process, either as a standalone project or as part of a wider automated event data collection effort. As GDELT shows, though, that’s hard to do well, and that day hasn’t arrived yet.

Computer-assisted coding is far more labor intensive than fully automated coding, but it also carries some advantages. Human coders can still discern better than the best automated coding programs when numerous reports are all referring to the same event, so the PITF data set does a very good job eliminating duplicate records. Also, the “where” part of each record in the PITF data set includes geocoordinates, and its human coders can accurately resolve the location of nearly every event to at least the local administrative area, a task over which fully automated processes sometimes still stumble.

Of course, press reports only capture a fraction of all the atrocities that occur in most conflicts, and journalists writing about hard-to-cover conflicts often describe these situations with stories that summarize episodes of violence (e.g., “Since January, dozens of villagers have been killed…”). The PITF data set tries to accommodate this pattern by recording two distinct kinds of events: 1) incidents, which occur in a single place in a short period of time, usually a single day; and 2) campaigns, which involve the same perpetrator and target group but may occur in multiple places over a longer period of time—usually days but sometimes weeks or months.

The inclusion of these campaigns alongside discrete events allows the data set to capture more information, but it also requires careful attention when using the results. Most statistical applications of data sets like this one involve cross-tabulations of events or deaths at a particular level during some period of time—say, countries and months. That’s relatively easy to do with data on discrete events located in specific places and days. Here, though, researchers have to decide ahead of time if and how they are going to blend information about the two event types. There are two basic options: 1) ignore the campaigns and focus exclusively on the incidents, treating that subset of the data set like a more traditional one and ignoring the additional information; or 2) make a convenient assumption about the distribution of the incidents of which campaigns are implicitly composed and apportion them accordingly.

For example, if we are trying to count monthly deaths from atrocities at the country level, we could assume that deaths from campaigns are distributed evenly over time and assign equal fractions of those deaths to all months over which they extend. So, a campaign in which 30 people were reportedly killed in Somalia between January and March would add 10 deaths to the monthly totals for that country in each of those three months. Alternatively, we could include all of the deaths from a campaign in the month or year in which it began. Either approach takes advantage of the additional information contained in those campaign records, but there is also a risk of double counting, as some of the events recorded as incidents might be part of the violence summarized in the campaign report.
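To make that even-apportionment rule concrete, here is a minimal sketch in base R; the record layout below is a hypothetical stand-in, not the data set's actual schema:

    # Hypothetical campaign record: 30 deaths in Somalia between January and March
    campaign <- data.frame(country = "Somalia",
                           start   = as.Date("2013-01-01"),
                           end     = as.Date("2013-03-31"),
                           deaths  = 30)

    # Spread a campaign's reported deaths evenly across the months it spans
    spread_campaign <- function(rec) {
      months <- seq(rec$start, rec$end, by = "month")
      data.frame(country = rec$country,
                 month   = format(months, "%Y-%m"),
                 deaths  = rec$deaths / length(months))
    }

    spread_campaign(campaign)  # 10 deaths each in 2013-01, 2013-02, and 2013-03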

It is also important to note that this data set does not record information about atrocities in which the United States is either the alleged perpetrator or the target (e.g., 9/11), because of legal restrictions on the activities of the CIA, which funds the data set’s production. This constraint presumably has a bigger impact on some cases, such as Iraq and Afghanistan, than others.

To provide a sense of what the data set contains and to make it easier for other researchers to use it, I wrote an R script that ingests and cross-tabulates the latest iteration of the data in country-month and country-year bins and then plots some of the results. That script is now posted on Github (here).
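For readers who would rather not dig through that script, the core of the country-month cross-tabulation of incident deaths boils down to something like this (the data frame and column names here are illustrative):

    # 'incidents' is assumed to be a data frame of incident records with
    # columns 'country', 'date' (class Date), and 'deaths'
    incidents$month <- format(incidents$date, "%Y-%m")
    deaths_cm <- aggregate(deaths ~ country + month, data = incidents, FUN = sum)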

One way to see how well the data set is capturing the trends we hope it will capture is to compare the figures it produces with ones from data sets in which we already have some confidence. While I was writing this post, Colombian “data enthusiast” Miguel Olaya tweeted a pair of graphs summarizing data on massacres in that country’s long-running civil war. The data behind his graphs come from the Rutas de Conflicto project, an intensive and well-reputed effort to document as many as possible of the massacres that have occurred in Colombia since 1980. Here is a screenshot of Olaya’s graph of the annual death counts from massacres in the Rutas data set since 1995, when the PITF data pick up the story:

Annual Deaths from Massacres in Colombia by Perpetrator (Source: Rutas de Conflicto)

Now here is a graph of deaths from the incidents in the PITF data set:

Annual Deaths from Atrocity Incidents in Colombia (Source: PITF Worldwide Atrocities Dataset)

Just eyeballing the two charts, the correlation looks pretty good. Both show a sharp increase in the tempo of killing in the mid-1990s; a sustained peak around 2000; a steady decline over the next several years; and a relatively low level of lethality since the mid-2000s. The annual counts from the Rutas data are two or three times larger than the ones from the PITF data during the high-intensity years, but that makes sense when we consider how much deeper of a search that project has conducted. There’s also a dip in the PITF totals in 1999 and 2000 that doesn’t appear in the Rutas data, but the comparisons over the larger span hold up. All things considered, this comparison makes the PITF data look quite good, I think.

Of course, the insurgency in Colombia has garnered better coverage from the international press than conflicts in parts of the world that are even harder to reach or less safe for correspondents than the Colombian highlands. On a couple of recent crises in exceptionally under-covered areas, the PITF data also seem to do a decent job capturing surges in violence, but only when we include campaigns as well as incidents in the counting.

The plots below show monthly death totals from a) incidents only and b) incidents and campaigns combined in the Central African Republic since 1995 and South Sudan since its independence in mid-2011. Here, deaths from campaigns have been assigned to the month in which the campaign reportedly began. In CAR, the data set identifies the upward trend in atrocities through 2013 and into 2014, but the real surge in violence that apparently began in late 2013 is only captured when we include campaigns in the cross-tabulation (the dotted line).

Monthly Deaths from Atrocities in the Central African Republic, Incidents vs. Incidents Plus Campaigns (Source: PITF Worldwide Atrocities Dataset)

The same holds in South Sudan. There, the incident-level data available so far miss the explosion of civilian killings that began in December 2013 and reportedly continue, but the combination of campaign and incident data appears to capture a larger fraction of it, along with a notable spike in July 2013 related to clashes in Jonglei State.

Monthly Deaths from Atrocities in South Sudan, Incidents vs. Incidents Plus Campaigns (Source: PITF Worldwide Atrocities Dataset)

These examples suggest that the PITF Worldwide Atrocities Dataset is doing a good job at capturing trends over time in lethal violence against civilians, even in some of the hardest-to-cover cases. To my knowledge, though, this data set has not been widely used by researchers interested in atrocities or political violence more broadly. Probably its most prominent use to date was in the Model component of the Tech Challenge for Atrocities Prevention, a 2013 crowdsourced competition funded by USAID and Humanity United. That challenge produced some promising results, but it remains one of the few applications of this data set on a subject for which reliable data are scarce. Here’s hoping this post helps to rectify that.

Disclosure: I was employed by SAIC as research director of PITF from 2001 until 2011. During that time, I helped to develop the initial version of this data set and was involved in decisions to fund its continued production. Since 2011, however, I have not been involved in either the production of the data or decisions about its continued funding. I am part of a group that is trying to secure funding for a follow-on project to the Model part of the Tech Challenge for Atrocities Prevention, but that effort would not necessarily depend on this data set.

Another Note on the Limitations of Event Data

Last week, Foreign Policy ran a blog post by Kalev Leetaru that used GDELT to try to identify trends over time in protest activity around the world. That’s a fascinating and important question, but it’s also a really hard one, and I don’t think Kalev’s post succeeds in answering it. I wanted to use this space to explain why, because the issues involved are fundamental to efforts to answer many similar and important questions about patterns in human social behavior over time.

To me, the heart of Kalev’s post is his attempt to compare the intensity of protest activity worldwide over the past 35 years, the entirety of the period covered by GDELT. Ideally, we would do this with some kind of index that accounted for things like the number of protest events that occurred, the number of people who participated in them, and the things those people did.

Unfortunately, the data set that includes all of that information for all relevant events around the world doesn’t exist and never will. Although it might feel like we now live in a Panopticon, we don’t. In reality, we can still only see things that get reported in sources to which we have access; those reports aren’t always “true,” sometimes conflict with one another, and are always incomplete; and, even in 2014, it’s still hard to reliably locate, parse, and encode data from the stories that we do see.

GDELT is the most ambitious effort to date to overcome these problems, and that ambition is helping to pull empirical social science in some new and productive directions. GDELT uses software to scour the web for media stories that contain information about a large but predetermined array of verbal and physical interactions. These interactions range from protests, threats, and attacks to more positive things like requests for aid and expressions of support. When GDELT’s software finds text that describes one of those interactions, it creates a record that includes numeric representations of words or phrases indicating what kind of interaction it was, who was involved, and where and when it took place. Each of those records becomes one tiny layer in an ever-growing stack. GDELT was only created in the 2010s, but its software has been applied to archival material to extend its coverage all the way back to 1979. The current version includes roughly a quarter-billion records, and that number now grows by tens of thousands every day.

GDELT grows out of a rich tradition of event data production in social science, and its coding process mimics many of the procedures that scholars have long used to try to catalog various events of interest—or, at least, to capture reasonably representative samples of them. As such, it’s tempting to treat GDELT’s records as markers of discrete events that can be counted and cross-tabulated to identify trends over time and other patterns of interest.

That temptation should be assiduously resisted for two reasons that Leetaru and others involved in GDELT’s original creation have frequently acknowledged. First, GDELT can only create records from stories that it sees, and the volume and nature of media coverage and its digitized renderings have changed radically over the past 30 years. This change continues and may still be accelerating. One result of this change is exponential growth over time in the volume of GDELT records, as shown in the chart below (borrowed from an informative post on the Ward Lab blog). Under these circumstances, it’s unclear what comparisons across years, and especially decades, are getting at. Are we seeing meaningful changes in the phenomena of interest, or are we really just seeing traces of change in the volume and nature of reporting on them?

Change Over Time in the Volume of GDELT Records, 1979-2011 (Source: Ward Lab)

Second, GDELT has not fully worked out how to de-duplicate its records. When the same event is reported in more than one media source, GDELT can’t always tell that they are the same event, sometimes even when it’s the same story appearing verbatim in more than one outlet. As a result, events that attract more attention are likely to generate more records. Under these circumstances, the whole idea of treating counts of records in certain categories as counts of certain event types becomes deeply problematic.

Kalev knows these things and tries to address them in his recent FP post on trends over time in protest activity. Here is how he describes what he does and the graph that results:

The number of protests each month is divided by the total number of all events recorded in GDELT that month to create a “protest intensity” score that tracks just how prevalent worldwide protest activity has been month-by-month over the last quarter-century (this corrects for the exponential rise in media coverage over the last 30 years and the imperfect nature of computer processing of the news). To make it easier to spot the macro-level patterns, a black 12-month moving average trend line is drawn on top of the graph to help clarify the major temporal shifts.

Intensity of protest activity worldwide 1979-April 2014 (black line is 12-month moving average) (Source: Kalev Leetaru via FP)
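For concreteness, here is a minimal sketch of that normalization, assuming a 'gdelt' data frame with a month identifier and GDELT's CAMEO event codes, in which root code 14 marks protest events:

    # 'gdelt' is assumed to be a data frame of GDELT records with a 'month'
    # column (e.g., "1989-06") and the CAMEO 'EventRootCode' column ("14" = protest)
    total   <- table(gdelt$month)
    protest <- table(factor(gdelt$month[gdelt$EventRootCode == "14"],
                            levels = names(total)))  # align months, zero-fill
    intensity <- as.numeric(protest) / as.numeric(total)

    # 12-month trailing moving average, like the black trend line on the graph
    ma12 <- stats::filter(intensity, rep(1/12, 12), sides = 1)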

Unfortunately, I don’t think Kalev’s normalization strategy addresses either of the aforementioned problems enough to make the kind of inferences he wants to make about trends over time in the intensity of protest activity around the world.

Let’s start at the top. The numerator of Kalev’s index is the monthly count of records in a particular set of categories. This is where the lack of de-duplication can really skew the picture, and the index Kalev uses does nothing to directly address it.

Without better de-duplication, we can’t fix this problem, but we might be less worried about it if we thought that duplication were a reliable marker of event intensity. Unfortunately, it almost certainly isn’t. Certain events catch the media’s eyes for all kinds of reasons. Some are related to the nature of the event itself, but many aren’t. The things that interest us change over time, as do the ways we talk about them and the motivations of the corporations and editors who partially mediate that conversation. Under these circumstances, it would strain credulity to assume that the frequency of reports on a particular event reliably represents the intensity, or even the salience, of that event. There are just too many other possible explanations to make that inferential leap.

And there’s trouble in the bottom, too. Kalev’s decision to use the monthly volume of all records in the denominator is a reasonable one, but it doesn’t fully solve the problem it’s meant to address, either.

What we get from this division is a proportion: protest-related records as a share of all records. The problem with comparing these proportions across time slices is that they can differ for more than one reason, and that’s true even if we (heroically) assume that the lack of de-duplication isn’t a concern. A change from one month to the next might result from a change in the frequency or intensity of protest activity, but it could also result from a change in the frequency or intensity of some other event type also being tallied. Say, for example, that a war breaks out and produces a big spike in GDELT records related to violent conflict. Under these circumstances, the number of protest-related records could stay the same or even increase, and we would still see a drop in the “protest intensity score” Kalev uses.
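A toy example makes the problem plain; the numbers below are invented purely for illustration:

    # Protest records hold steady at 100 per month while records of other
    # event types spike in month 3 (say, because a war breaks out)
    protest_records <- c(100, 100, 100)
    other_records   <- c(900, 900, 4900)
    round(protest_records / (protest_records + other_records), 2)
    # [1] 0.10 0.10 0.02  -- "protest intensity" falls 80% with no change in protest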

In the end, what we get from Kalev’s index isn’t a reliable measure of the intensity of protest activity around the world and its change over time. What we get instead is a noisy measure of relative media attention to protest activity over a period of time when the nature of media attention itself has changed a great deal in ways that we still don’t fully understand. That quantity is potentially interesting in its own right. Frustratingly, though, it cannot answer seemingly simple questions like “How much protest activity are we seeing now?” or “How has the frequency or intensity of protest activity changed over the past 30 years?”

I’ll wrap this up by saying that I am still really, really excited about the new possibilities for social scientific research opening up as a result of projects like GDELT and, now, the Open Event Data Alliance it helped to spawn. At the same time, I think we social scientists have to be very cautious in our use of these shiny new things. As excited as we may be, we’re also the ones with the professional obligation to check the impulse to push them harder than they’re ready to go.

China and Russia and What Could Have Happened

Twenty-five years ago, I was strolling down Leningrad’s main drag, Nevsky Prospekt, with a clutch of other American undergraduates who had recently arrived for two months of intensive language study when Professor Edna Andrews dashed up to us with the news. “They’re shooting them,” she said (or something like it—who can trust a 25-year-old memory of a speech fragment?) with obvious agitation. “They’re shooting the students in Tiananmen Square!”

Had Edna not given us that news, we probably wouldn’t have heard it, or at least not until we got home. In 1989, glasnost’ had already come to the USSR, but that didn’t mean speech was free. State newspapers were still the only ones around, at least for those of us without connections to the world of samizdat. Some of those newspapers were more informative than others, but the limits of political conversation were still clearly drawn. The Internet didn’t exist, and international calls could only be made by appointment from state-run locations with plastic phones in cubicle-like spaces and who-knows who listening while you talked. Trustworthy information still only trickled through a public sphere mostly bifurcated between propaganda and silence.

What’s striking to me in retrospect is how differently things could have turned out in both countries. When she gave us the news about Tiananmen, Edna was surely agitated because it involved students like the ones she taught being slaughtered. I suspect she was also distressed, though, because at the time it was still easy to imagine something similar happening in the USSR, perhaps even to people she knew personally.

In 1989, politics had already started to move in the Soviet Union, but neither democratization nor disintegration was a foregone conclusion. That spring, citizens had picked delegates to the inaugural session of the Congress of People’s Deputies in elections that were, at the time, the freest the USSR had ever held. The new Congress’ sessions were shown on live television, and their content was stunning. “Deputies from around the country railed against every scandal and shortcoming of the Soviet system that could be identified,” Thomas Skallerup and James P. Nichol describe in their chapter for the Library of Congress’ Russia country study. “Speakers spared neither Gorbachev, the KGB, nor the military.”

But the outspokenness of those reformist deputies belied their formal power. More than 80 percent of the Congress’ deputies were Communist Party members, and the new legislative body the deputies elected that summer, the Supreme Soviet of the USSR, was stuffed with “old-style party apparatchiks.” Two years later, reactionaries inside the government mounted a coup attempt in which President Gorbachev was arrested and detained for a few days and tanks were deployed on the streets of Moscow.

Tank near Red Square on 19 August 1991. © Anatoly Sapronyenkov/AFP/Getty Images

That August Putsch looks a bit clowny with hindsight, but it didn’t have to fail. Likewise, the brutal suppression of China’s 1989 uprising didn’t have to happen, or to succeed when it did. In a story published this week in the New York Times, Andrew Jacobs and Chris Buckley describe the uncertainty of Chinese policy toward the uprising and the disunity of the armed forces tasked with executing it—and, eventually, the protesters in Tiananmen Square.

“At the time,” Jacobs and Buckley write, “few in the military wanted to take direct responsibility for the decision to fire on civilians. Even as troops pressed into Beijing, they were given vague, confusing instructions about what to do, and some commanders sought reassurances that they would not be required to shoot.” Seven senior commanders signed a petition calling on political leaders to withdraw the troops. Those leaders responded by disconnecting many of the special phones those commanders used to communicate with each other. When troops were finally given orders to retake the square “at any cost,” some commanders ignored them. At least one pretended that his battalion’s radio had malfunctioned.

As Erica Chenoweth and Maria Stephan show in their study of civil resistance, nonviolent uprisings are much more likely to succeed when they prompt defections by security forces. The Tiananmen uprising was crushed, but history could have slipped in many other directions. And it still can.

Conflict Events, Coup Forecasts, and Data Prospecting

Last week, for an upcoming post to the interim blog of the atrocities early-warning project I direct, I got to digging around in ACLED’s conflict event data for the first time. Once I had the data processed, I started wondering if they might help improve forecasts of coup attempts, too. That train of thought led to the preliminary results I’ll describe here, and to a general reminder of the often-frustrating nature of applied statistical forecasting.

ACLED is the Armed Conflict Location & Event Data Project, a U.S. Department of Defense–funded, multi-year endeavor to capture information about instances of political violence in sub-Saharan Africa from 1997 to the present. ACLED’s coders scan an array of print and broadcast sources, identify relevant events from them, and then record those events’ date, location, and form (battle, violence against civilians, or riots/protests); the types of actors involved; whether or not territory changed hands; and the number of fatalities that occurred. Researchers can download all of the project’s data in various formats and structures from the Data page, one of the better ones I’ve seen in political science.

I came to ACLED last week because I wanted to see if violence against civilians in Somalia had waxed, waned, or held steady in recent months. Trying to answer that question with their data meant:

  • Downloading two Excel spreadsheets, Version 4 of the data for 1997-2013 and the Realtime Data file covering (so far) the first five months of this year;
  • Processing and merging those two files, which took a little work because my software had trouble reading the original spreadsheets and the labels and formats differed a bit across them; and
  • Subsetting and summarizing the data on violence against civilians in Somalia, which also took some care because there was an extra space at the end of the relevant label in some of the records (see the sketch after this list).
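In R, those steps might look roughly like the sketch below; the file names and column labels are placeholders rather than ACLED's actual ones:

    library(readxl)  # for reading the Excel releases

    # File names here are illustrative
    v4 <- read_excel("ACLED_Version4_1997-2013.xlsx")
    rt <- read_excel("ACLED_Realtime_2014.xlsx")

    # Harmonize labels across the two files before stacking them
    names(v4) <- tolower(names(v4))
    names(rt) <- tolower(names(rt))
    common <- intersect(names(v4), names(rt))
    acled  <- rbind(as.data.frame(v4)[, common], as.data.frame(rt)[, common])

    # Trim the stray trailing spaces in the event-type labels, then subset
    acled$event_type <- trimws(acled$event_type)
    vac_somalia <- subset(acled, country == "Somalia" &
                            event_type == "Violence against civilians")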

Once I had done these things, it was easy to generalize it to the entire data set, producing tables with monthly counts of fatalities and events by type for all African countries over the past 17 years. And, once I had those country-month counts of conflict events, it was easy to imagine using them to try to help forecast coup attempts in the world’s most coup-prone region. Other things being equal, variations across countries and over time in the frequency of conflict events might tell us a little more about the state of politics in those countries, and therefore where and when coup attempts are more likely to happen.

Well, in this case, it turns out they don’t tell us much more. The plot below shows ROC curves and the areas under those curves for the out-of-sample predictions from a five-fold cross-validation exercise involving a few country-month models of coup attempts. The Base Model includes: national political regime type (the categorization scheme from PITF’s global instability model applied to Polity 3d, the spell-file version); time since last change in Polity score (in days, logged); infant mortality rate (relative to the annual global median, logged); and an indicator for any coup attempts in the previous 24 months (yes/no). The three other models add logged sums of counts of ACLED events by type—battles, violence against civilians, or riots/protests—in the same country over the previous three, six, or 12 months, respectively. These are all logistic regression models, and the dependent variable is a binary one indicating whether or not any coup attempts (successful or failed) occurred in that country during that month, according to Powell and Thyne.
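Here is a bare-bones sketch of that kind of five-fold cross-validation, with hypothetical variable names standing in for the measures described above; it illustrates the approach rather than reproducing the script linked below:

    library(pROC)  # for ROC curves and AUC

    # 'dat' is assumed to be a country-month data frame with a binary 'coup'
    # indicator and the covariates named below (all names are illustrative)
    set.seed(20140605)
    dat$fold <- sample(rep(1:5, length.out = nrow(dat)))

    preds <- rep(NA_real_, nrow(dat))
    for (k in 1:5) {
      m <- glm(coup ~ regime_type + log1p(polity_duration) +
                 log(infant_mortality_rel) + coup_past24,
               data = dat[dat$fold != k, ], family = binomial)
      preds[dat$fold == k] <- predict(m, newdata = dat[dat$fold == k, ],
                                      type = "response")
    }

    roc_base <- roc(dat$coup, preds)  # out-of-sample ROC curve
    auc(roc_base)                     # area under that curve
    plot(roc_base)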

ROC Curves and AUC Scores from Five-Fold Cross-Validation of Coup Models Without and With ACLED Event Counts

As the chart shows, adding the conflict event counts to the base model seems to buy us a smidgen more discriminatory power, but not enough to have confidence that they would routinely lead to more accurate forecasts. Intriguingly, the crossing of the ROC curves suggests that the base model, which emphasizes structural conditions, is actually a little better at identifying the most coup-prone countries. The addition of conflict event counts to the model leads to some under-prediction of coups in that high-risk set, but the balance tips the other way in countries with less structural vulnerability. In the aggregate, though, there is virtually no difference in discriminatory power between the base model and the ones that add the conflict event counts.

There are, of course, many other ways to group and slice ACLED’s data, but the rarity of coups leads me to believe that narrower cuts or alternative operationalizations aren’t likely to produce stronger predictive signals. In Africa since 1997, there are only 36 country-months with coup attempts, according to Powell and Thyne. When the events are this rare and complex and the examples this few, there’s really not much point in going beyond the most direct measures. Under these circumstances, we’re unlikely to discover finer patterns, and if we do, we probably shouldn’t have much confidence in them. There are also other models and techniques to try, but I’m dubious for the same reasons. (FWIW, I did try Random Forests and got virtually identical accuracy.)

So those are the preliminary results from this specific exercise. (The R scripts I used are on Github, here). I think those results are interesting in their own right, but the process involved in getting to them is also a great example of the often-frustrating nature of applied statistical forecasting. I spent a few hours each day for three days straight getting from the thought of exploring ACLED to the results described here. Nearly all of that time was spent processing data; only the last half-hour or so involved any modeling. As is often the case, a lot of that data-processing time was really just me staring at my monitor trying to think of another way to solve some problem I’d already tried and failed to solve.

In my experience, that kind of null result is where nearly all statistical forecasting ideas end. Even when you’re lucky enough to have the data to pursue them, few of your ideas pan out. But panning is the right metaphor, I think. Most of the work is repetitive and frustrating, but every so often you catch a nice nugget. Those nuggets tempt you to keep looking for more, and once in a great while, they can make you rich.

Introducing A New Venue for Atrocities Early Warning

Starting today, the bits of this blog on forecasting and monitoring mass atrocities are moving to their proper home, or at least the initial makings of it. Say hi to the (interim) blog of the Early Warning Project.

Since 2012, I have been working as a consultant to the U.S. Holocaust Memorial Museum’s Center for the Prevention of Genocide (CPG) to help build a new global early-warning system for mass atrocities. As usual, that process is taking longer than we had expected. We now have working versions of the project’s two main forecasting streams—statistical risk assessments and a “wisdom of (expert) crowds” system called an opinion pool—and CPG has hired a full-time staffer (hi, Ali) to manage their day-to-day workings. Unfortunately, though, the web site that will present, discuss, and invite discussion of those forecasts is still under construction. Thanks to Dartmouth’s DALI Lab, we’ve got a great prototype, but there’s finishing work to be done, and doing it takes a while.

Well, delays, be damned. We think the content we’re producing is useful now, so we’re not waiting for that site to get finished to start sharing it. Instead, we’re launching this interim blog to go ahead and start doing just that.

When the project’s full-blown web site finally goes up, it will feature a blog, too, and all of the content from this interim venue will migrate there. Until then, if you’re interested in atrocities early warning and prevention—or applied forecasting more generally—please come see what we’re doing, share what you find interesting, and help us think about how to do it even better.

Meanwhile, Dart-Throwing Chimp will keep plugging along on its core themes of democratization, political instability, and forecasting. If you’ve got the interest and the bandwidth, I hope you’ll find time to watch and engage with both channels.

Turkey Regresses Toward the Mean

Like many Turkey watchers, Erik Meyersson and Dani Rodrik argue in the latest Foreign Affairs that Turkey is no longer a democracy. In contrast to many Turkey watchers, they argue that this slide began early in the now-eleven-year rule of Recep Tayyip Erdogan and his Justice and Development Party (AKP) and has continued apace ever since.

Turkey’s institutional deterioration is not a recent matter. It started long before Erdogan’s manifestly heavy-handed and polarizing responses to the Gezi protests of the summer of 2013 and to the corruption probe in winter 2013. The harsh crackdown on the media over the last year is but the latest phase in an ongoing process of repression of independent press. And Erdogan and the Gülenists have long manipulated the judiciary, using it to harass and jail opponents on charges ranging from the flimsy to the fabricated.

If this is correct—and I believe it is—then Turkey has essentially regressed toward the mean. Most attempts at democracy fail, and in the past 20 years, most of those failures have come in the form of consolidations of incumbent advantage. An authoritarian regime breaks down; competitive elections are held; a party wins those elections; and, finally, that party uses its incumbency to retool the machinery of the state in ways that ensure it stays in power.

Consolidations of incumbent advantage are common, in part, because most political organizations covet power, especially once they attain it. Even when those organizations don’t covet power, though, uncertainty about the willingness of their political rivals and the military to abide by democratic rules gives ruling parties added incentive to tighten their grip on government as a way to avoid their worst-case scenarios involving the re-establishment of authoritarian rule under someone else.

In my book on dilemmas of democratic consolidation, written about five years ago, I used Turkey under the AKP as an example of how, counterintuitively, these pressures could sometimes counterbalance each other and actually help democracy persist. In the Turkish case, it was the military’s traditional role as the guarantor of secular republicanism and final arbiter of political disputes that seemed to be checking democracy’s normal tendencies toward consolidation of incumbent advantage. The threat of a military coup was in a kind of sweet spot: it was still real enough to deter the AKP from trying nakedly to impose authoritarian rule, but it was no longer so strong that AKP would feel compelled to act aggressively in order to protect against its least-preferred outcome.

Apparently, that’s changed. Over the past decade, the risk of a military coup has declined enough that AKP no longer regards it as a credible threat. Of course, AKP helped bring about this shift, and thus the consolidation of its own power, with its dogged prosecution of the alleged Ergenekon coup plot. As Erik Meyersson pointed out in an email to me, AKP’s sheer electoral power surely helped to deter military intervention as well. Had the military usurped power from Erdogan and his colleagues, the ensuing social and economic upheaval would likely have rendered the coup a poisoned chalice. Ironically, Turkey’s membership in NATO may have played a role, too, by helping to socialize Turkish officers against direct intervention in politics.

Whatever the precise and ultimately unknowable causes of this regression are, the status that still seemed fuzzy to me a year ago is now clear. Turkey has joined the ranks of the world’s electoral authoritarian regimes, full stop. In so doing, it has followed the modal path of attempts at democracy in the post–Cold War period, giving us another reminder that “normal” isn’t necessarily better.
