The Rwanda Enigma

For analysts and advocates trying to assess risks of future mass atrocities in hopes of preventing them, Rwanda presents an unusual puzzle. Most of the time, specialists in this field readily agree on which countries are especially susceptible to genocide or mass killing, either because those countries are already experiencing large-scale civil conflict or because they are widely considered susceptible to it. Meanwhile, countries that sustain long episodes of peace and steadily grow their economies are generally presumed to have reduced their risk and eventually to have escaped this trap for good.

Contemporary Rwanda is puzzling because it provokes a polarized reaction. Many observers laud Rwanda as one of Africa’s greatest developmental successes, but others warn that it remains dangerously prone to mass atrocities. In a recent essay for African Arguments on how the Rwandan genocide changed the world, Omar McDoom nicely encapsulates this unusual duality:

What has changed inside Rwanda itself since the genocide? The country has enjoyed a remarkable period of social stability. There has not been a serious incident of ethnic violence in Rwanda for nearly two decades. Donors have praised the country’s astonishing development. Economic growth has averaged over 6% per year, poverty and inequality have declined, child and maternal mortality have improved, and primary education is now universal and free. Rwanda has shown, in defiance of expectations, that an African state can deliver security, public services, and rising prosperity.

Yet, politically, there is some troubling continuity with pre-genocide Rwanda. Power remains concentrated in the hands of a small, powerful ethnic elite led by a charismatic individual with authoritarian tendencies. In form, current president Paul Kagame and his ruling party, the RPF, the heroes who ended the genocide, appear to exercise power in a manner similar to former president Juvenal Habyarimana and his ruling MRND party, the actors closely-tied to those who planned the slaughter. The genocide is testament to what unconstrained power over Rwanda’s unusually efficient state machinery can enable.

That duality also emerges from a comparison of two recent quantitative rankings. On the one hand, the World Bank now ranks Rwanda 32nd on the latest edition of its “ease of doing business” index—not 32nd in Africa, but 32nd of 189 countries worldwide. On the other hand, statistical assessments of the risk of an onset of state-led mass killing identify Rwanda as one of the 25 countries worldwide currently most vulnerable to this kind of catastrophe.

How can both of these things be true? To answer that question, we need to have a clearer sense of where that statistical risk assessment comes from. The number that ranks Rwanda among the 25 countries most susceptible to state-led mass killing is actually an average of forecasts from three models representing a few different ideas about the origins of mass atrocities, all applied to publicly available data from widely used sources.

  • Drawing on work by Barbara Harff and the Political Instability Task Force, the first model emphasizes features of countries’ national politics that hint at a predilection to commit genocide or “politicide,” especially in the context of political instability. Key risk factors in Harff’s model include authoritarian rule, the political salience of elite ethnicity, evidence of an exclusionary elite ideology, and international isolation as measured by trade openness.
  • The second model takes a more instrumental view of mass killing. It uses statistical forecasts of future coup attempts and new civil wars as proxy measures of things that could either spur incumbent rulers to lash out against threats to their power or usher in an insecure new regime that might do the same.
  • The third model is really not a model but a machine-learning process called Random Forests applied to the risk factors identified by the other two. The resulting algorithm is an amalgamation of theory and induction that takes experts’ beliefs about the origins of mass killing as its jumping-off point but also leaves more room for inductive discovery of contingent effects.

All of these models are estimated from historical data that compares cases where state-led mass killings occurred to ones where they didn’t. In essence, we look to the past to identify patterns that will help us spot cases at high risk of mass killing now and in the future. To get our single-best risk assessment—the number that puts Rwanda in the top (or bottom) 25 worldwide—we simply average the forecasts from these three models. We prefer the average to a single model’s output because we know from work in many fields—including meteorology and elections forecasting—that this “ensemble” approach generally produces more accurate assessments than we could expect to get from any one model alone. By combining forecasts, we learn from all three perspectives and hedge against the biases of any one of them.
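To make that combination step concrete, here is a minimal sketch of it in R, with invented numbers standing in for the three models’ outputs:

```r
# Toy illustration of the unweighted ensemble average; these probabilities
# are made up for illustration, not actual model output
forecasts <- c(pitf_harff    = 0.021,
               elite_threat  = 0.032,
               random_forest = 0.014)
ensemble <- mean(forecasts)  # unweighted average across the three models
round(ensemble, 3)           # 0.022
```

Countries are then ranked on that combined probability, which is how a case like Rwanda can land in the global top 25 without topping any single model’s list.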

Rwanda lands in the top 25 worldwide because all three models identify it as a relatively high-risk case. It ranks 15th on the PITF/Harff model, 28th on the “elite threat” model, and 30th on the Random Forest. The PITF/Harff model sees a relatively low risk in Rwanda of the kinds of political instability that typically trigger onsets of genocide or politicide, but it also pegs Rwanda as the kind of regime most likely to resort to mass atrocities if instability were to occur—namely, an autocracy in which elites’ ethnicity is politically salient in a country with a recent history of genocide. Rwanda also scores fairly high on the “elite threat” model because, according to our models of these things, it is at relatively high risk of a new insurgency and moderate risk of a coup attempt. Finally, the Random Forest sees a very low probability of mass killing onset in Rwanda but still pegs it as a riskier case than most.

Our identification of Rwanda as a relatively high-risk case is echoed by some, but not all, of the other occasional global assessments of countries’ susceptibility to mass atrocities. In her own applications of her genocide/politicide model for the task of early warning, Barbara Harff pegged Rwanda as one of the world’s riskiest cases in 2011 but not in 2013. Similarly, the last update of Genocide Watch’s Countries at Risk Report, in 2012, lists Rwanda as one of more than a dozen countries at stage five of seven on the path to genocide, putting it among the 35 countries worldwide at greatest risk. By contrast, the Global Centre for the Responsibility to Protect has not identified Rwanda as a situation of concern in any of its R2P Monitor reports to date, and the Sentinel Project for Genocide Prevention does not list Rwanda among its situations of concern, either. Meanwhile, recent reporting on Rwanda from Human Rights Watch has focused mostly on the pursuit of justice for the 1994 genocide and on other kinds of human-rights violations in contemporary Rwanda.

To see what our own pool of experts makes of our statistical risk assessment and to track changes in their views over time, we plan to add a question to our “wisdom of (expert) crowds” forecasting system asking about the prospect of a new state-led mass killing in Rwanda before 2015. If one does not happen, as we hope and expect will be the case, we plan to re-launch the question at the start of next year and will continue to do so as long as our statistical models keep identifying it as a case of concern.

In the meantime, I thought it would be useful to ask a few country experts what they make of this assessment and how a return to mass killing in Rwanda might come about. Some were reluctant to speak on the record, and understandably so. The present government of Rwanda has a history of intimidating individuals it perceives as its critics. As Michela Wrong describes in a recent piece for Foreign Policy,

A U.S. State Department spokesperson said in mid-January, “We are troubled by the succession of what appear to be politically motivated murders of prominent Rwandan exiles. President Kagame’s recent statements about, quote, ‘consequences’ for those who betray Rwanda are of deep concern to us.”

It is a pattern that suggests the Rwandan government may have come to see the violent silencing of critics—irrespective of geographical location and host country—as a beleaguered country’s prerogative.

Despite these constraints, the impression I get from talking to some experts and reading the work of others is that our risk assessment strikes nearly all of them as plausible. None said that he or she expects an episode of state-led mass killing to begin soon in Rwanda. Consistent with the thinking behind our statistical models, though, many seem to believe that another mass killing could occur in Rwanda, and if one did, it would almost certainly come in reaction to some other rupture in that country’s political stability.

Filip Reyntjens, a professor at the University of Antwerp who wrote a book on Rwandan politics since the 1994 genocide, was both the most forthright and the most pessimistic in his assessment. Via email, he described Rwanda as

A volcano waiting to erupt. Nearly all field research during the last 15 years points at pervasive structural violence that may, as we know, become physical, acute violence following a trigger. I don’t know what that trigger will be, but I think a palace revolution or a coup d’etat is the most likely scenario. That may create a situation difficult to control.

In a recent essay for Juncture that was adapted for the Huffington Post (here), Phil Clark sounds more optimistic than Reyntjens, but he is not entirely sanguine, either. Clark sees the structure and culture of the country’s ruling party, the Rwandan Patriotic Front (RPF), as the seminal feature of Rwandan politics since the genocide and describes it as a double-edged sword. On the one hand, the RPF’s cohesiveness and dedication to purpose have enabled it, with help from an international community with a guilty conscience, to make “enormous” developmental gains. On the other hand,

The RPF’s desire for internal cohesion has made it suspicious of critical voices within and outside of the party—a feature compounded by Rwanda’s fraught experience of multi-party democracy in the early 1990s, which saw the rise of ethnically driven extremist parties and helped to create an environment conducive to genocide. The RPF’s singular focus on rebuilding the nation and facilitating the return of refugees means it has often viewed dissent as an unaffordable distraction. The disastrous dalliance with multipartyism before the genocide has only added to the deep suspicion of policy based on the open contestation of ideas.

Looking ahead, Clark wonders what happens when that intolerance for dissent bumps up against popular frustrations, as it probably will at some point:

For the moment, there are few signs of large-scale popular discontent with the closed political space. However, any substantial decline in socio-economic conditions in the countryside will challenge this. The RPF’s gamble appears to be that the population will tolerate a lack of national political contestation provided domestic stability and basic living standards are maintained. For now, the RPF seems to have rightly judged the popular mood but that situation may not hold.

Journalist Kris Berwouts portrays similarly ambiguous terrain in a recent piece for the Dutch-language magazine Mo that also appeared on the blog African Arguments (here). Berwouts quotes David Himbara, a former Rwandan regime insider who left the country in 2010 and has vocally criticized the Kagame government ever since, as telling him that “all society has vanished from Rwanda, mistrust is complete. It has turned Rwanda into a time bomb.” But Berwouts juxtaposes that dire assessment with the cautiously optimistic view of Belgian journalist Marc Hoogsteyns, who has worked in the region for years and has family ties by marriage to its Tutsi community. According to Hoogsteyns,

Rwanda is a beautiful country with many strengths and opportunities, but at the same time it is some kind of African version of Brave New World. People are afraid to talk. But they live more comfortably and safely than ever before, they enjoy high quality education and health care. They are very happy with that. The Tutsi community stands almost entirely behind Kagame and also most Hutu can live with it. They obviously don’t like the fact that they do not count on the political scene, but they can do what they want in all other spheres of life. They can study and do business etcetera. They can deal with the level of repression, because they know that countries such as Burundi, Congo or Kenya are not the slightest bit more democratic. Honestly, if we would have known twenty years ago, just after the genocide, that Rwanda would achieve this in two decades, we would have signed for it immediately.

As people of a certain age in places like Sarajevo or Bamako might testify, though, stability is a funny thing. It’s there until it isn’t, and when it goes, it sometimes goes quickly. In this sense, the political crises that sometimes produce mass killings are more like earthquakes than elections. We can spot the vulnerable structures fairly accurately, but we’re still not very good at anticipating the timing and dynamics of ruptures in them.

In the spirit of that last point, it’s important to acknowledge that the statistical assessment of Rwanda’s risk of mass killing is a blunt piece of information. Although it does specifically indicate a susceptibility to atrocities perpetrated by state security forces or groups acting at their behest, it does not necessarily implicate the RPF as the likely perpetrators. The qualitative assessments discussed above suggest that some experts find that scenario plausible, but it isn’t the only one consistent with our statistical finding. A new regime brought to power by coup or revolution could also become the agent of a new wave of mass atrocities in Rwanda, and the statistical forecast would be just as accurate.

Egypt’s recent past offers a case in point. Our statistical assessments of susceptibility to state-led mass killing in early 2013 identified Egypt as a relatively high-risk case, like Rwanda now. At the time, Mohammed Morsi was president, and one plausible interpretation of that risk assessment might have centered on the threat the Muslim Brotherhood’s supporters posed to Egypt’s Coptic Christians. Fast forward to July 2013, and the mass killing we ended up seeing in Egypt came at the hands of an army and police who snatched power away from Morsi and the Brotherhood and then proceeded to kill hundreds of their unarmed sympathizers. That outcome doesn’t imply that Coptic Christians weren’t at grave risk before the coup, but it should remind us to consider a variety of ways these systemic risks might become manifest.

Still, after conversations with a convenience sample of regional experts, I am left with the impression that the risk our statistical models identify of a new state-led mass killing in Rwanda is real, and that it is possible to imagine the ruling RPF as the agents of such violence.

No one seems to expect the regime to engage in mass violence without provocation, but the possibility of a new Hutu insurgency, and the state’s likely reaction to it, emerged from those conversations as perhaps the most likely scenario. According to some of the experts with whom I spoke, many Rwandan Hutus are growing increasingly frustrated with the RPF regime, and some radical elements of the Hutu diaspora appear to be looking for ways to harness that frustration. The presence of an insurgency is the single most powerful predictor of state-led mass killing, and it does not seem far-fetched to imagine the RPF regime using “scorched earth” tactics in response to the threat or occurrence of attacks on its soldiers and Tutsi citizens. After all, this is the same regime whose soldiers pursued Hutu refugees into Zaire in the mid-1990s and, according to a 2010 U.N. report, participated in the killings of tens of thousands of civilians in war crimes that were arguably genocidal.

Last but not least, we can observe that Rwanda has suffered episodes of mass killing roughly once per generation since independence—in the early 1960s, in 1974, and again in the early 1990s, culminating in the genocide of 1994 and the reprisal killings that followed. History certainly isn’t destiny, but our statistical models confirm that in the case of mass atrocities, it often rhymes.

It saddens me to write this piece about a country that just marked the twentieth anniversary of one of the most lethal genocides since the Holocaust, but the point of our statistical modeling is to see what the data say that our mental models and emotional assessments might overlook. A reprisal of mass killing in Rwanda would be horribly tragic. As Free Africa Foundation president George Ayittey wrote in a recent letter to the Wall Street Journal, however, “The real tragedy of Rwanda is that Mr. Kagame is so consumed by the 1994 genocide that, in his attempt to prevent another one, he is creating the very conditions that led to it.”

Whither Organized Violence?

The Human Security Research Group has just published the latest in its series of now-annual reports on “trends in organized violence around the world,” and it’s essential reading for anyone deeply interested in armed conflict and other forms of political violence. You can find the PDF here.

The 2013 edition takes Steven Pinker’s Better Angels as its muse and largely concurs with Pinker’s conclusions. I’ll sheepishly admit that I haven’t read Pinker’s book (yet), so I’m not going to engage directly in that debate. Instead, I’ll call attention to what the report’s authors infer from their research about future trends in political violence. Here’s how that bit starts, on p. 18:

The most encouraging data from the modern era come from the post–World War II years. This period includes the dramatic decline in the number and deadliness of international wars since the end of World War II and the reversal of the decades-long increase in civil war numbers that followed the end of the Cold War in the early 1990s.

What are the chances that these positive changes will be sustained? No one really knows. There are too many future unknowns to make predictions with any degree of confidence.

On that point, political scientist Bear Braumoeller would agree. In an interview last year for Popular Science (here), Kelsey Atherton asked Braumoeller about his assertion in a recent paper (here) that it will take 150 years to know if the downward trend in warfare that Pinker and others have identified is holding. Braumoeller replied:

Some of this literature points to “the long peace” of post-World War II. Obviously we haven’t stopped fighting wars entirely, so what they’re referring to is the absence of really really big wars like World War I and World War II. Those wars would have to be absent for like 70 to 75 more years for us to have confidence that there’s been a change in the baseline rate of really really big wars.

That’s sort of a separate question from how we know whether there are trends in warfare in general. We need to understand that war and peace are both stochastic processes. We need a big enough sample to rule out the historical average, which is about one or two big wars per century. We just haven’t had enough time since World War I and World War II to rule out the possibility that nothing’s changed.

I suspect that the authors of the Human Security Report would not dispute that claim, but after carefully reviewing Pinker’s and their own evidence, they do see causes for cautious optimism. Here I’ll quote at length, because I think it’s important to see the full array of forces the authors weigh before arriving at their cautiously optimistic speculations.

The case for pessimism about the global security future is well rehearsed and has considerable support within the research community. Major sources of concern include the possibility of outbreaks of nuclear terrorism, a massive transnational upsurge of lethal Islamist radicalism, or wars triggered by mass droughts and population movements driven by climate change.

Pinker notes reasons for concern about each of these potential future threats but also skepticism about the more extreme claims of the conflict pessimists. Other possible drivers of global violence include the political crises that could follow the collapse of the international financial system and destabilizing shifts in the global balance of economic and military power—the latter being a major concern of realist scholars worried about the economic and military rise of China.

But focusing exclusively on factors and processes that may increase the risks of large-scale violence around the world, while ignoring those that decrease it, also almost certainly leads to unduly pessimistic conclusions.

In the current era, factors and processes that reduce the risks of violence not only include the enduring impact of the long-term trends identified in Better Angels but also the disappearance of two major drivers of warfare in the post–World War II period—colonialism and the Cold War. Other post–World War II changes that have reduced the risks of war include the entrenchment of the global norm against interstate warfare except in self-defence or with the authority of the UN Security Council; the intensification of economic and financial interdependence that increases the costs and decreases the benefits of cross-border warfare; the spread of stable democracies; and the caution-inducing impact of nuclear weapons on relations between the major powers.

With respect to civil wars, the emergent and still-growing system of global security governance discussed in Chapter 1 has clearly helped reduce the number of intrastate conflicts since the end of the Cold War. And, at what might be called the “structural” level, we have witnessed steady increases in national incomes across the developing world. This is important because one of the strongest findings from econometric research on the causes of war is that the risk of civil wars declines as national incomes—and hence governance and other capacities—increase. Chapter 1 reports on a remarkable recent statistical study by the Peace Research Institute, Oslo (PRIO) that found that if current trends in key structural variables are sustained, the proportion of the world’s countries afflicted by civil wars will halve by 2050.

Such an outcome is far from certain, of course, and for reasons that have yet to be imagined, as well as those canvassed by the conflict pessimists. But, thanks in substantial part to Steven Pinker’s extraordinary research, there are now compelling reasons for believing that the historical decline in violence is both real and remarkably large—and also that the future may well be less violent than the past.

After reading the new Human Security Report, I remain a short-term pessimist and long-term optimist. As I’ve said in a few recent posts (see especially this one), I think we’re currently in the thick of a period of systemic instability that will continue to produce mass protests, state collapse, mass killing, and other forms of political instability at higher rates than we’ve seen since the early 1990s for at least the next year or two.

At the same time, I don’t think this local upswing marks a deeper reversal of the long-term trend that Pinker identifies, and that the Human Security Report confirms. Instead, I believe that the global political economy is continuing to evolve in a direction that makes political violence less common and less lethal. This system creep is evident not only in the aforementioned trends in armed violence, but also in concurrent and presumably interconnected trends in democratization, socio-economic development, and global governance. Until we see significant and sustained reversals in most or all of these trends, I will remain optimistic about the directionality of the underlying processes of which these data can give us only glimpses.

States Aren’t the Only Mass Killers

We tend to think of mass killing as something that states do, but states do not have a monopoly on this use of force. Many groups employ violence in an attempt to further their political and economic agendas; civilians often suffer the consequences of that violence, and sometimes that suffering reaches breathtaking scale.

This point occurred to me again as I thought about the stunning acts of mass violence that Boko Haram has carried out in northern Nigeria in the past few weeks. The chart below comes from the Council on Foreign Relations’ Nigeria Security Tracker, an online interface for a data set that counts deaths from “violent incidents directed at government property, places of worship, and suicide bombings.” The sharp upward bend at the far right of that red line represents the sudden and brutal end of several hundred lives in the past two months in various towns and villages in a part of the world that surely isn’t as alien to Americans as many of us assume. In Nigeria, too, parents wake up and set about the business of providing for themselves and their families, and many kids toddle off to school to learn and fidget and chatter with friends. Over the past few years, Boko Haram has repeatedly interrupted those daily routines with scores of attacks resulting in thousands of murders.

[Chart: deaths from Boko Haram-related violence, per CFR’s Nigeria Security Tracker]

I suspect the tendency to see mass killing as the purview of states is driven by the extraordinary salience of two archetypal cases—the Holocaust, of course, but also the Rwandan genocide. From those examples, we infer that violence on this scale requires resources, organization, and opportunity on a scale that in “modern” times only states are supposed to possess. The Holocaust took this bureaucratic logic to unique extremes, but many accounts of the Rwandan genocide also emphasize state planning and propaganda as necessary conditions for that episode of mass murder in extremis.

It’s true that resources, organization, and opportunity facilitate mass violence, and that states are much more likely to have them. In some contexts, though, rebel groups and other non-state actors can accumulate enough resources and become well enough organized to kill on a comparable scale. This is especially likely in the same contexts in which states usually perpetrate mass killing, namely, in civil wars. In some wars, rebels manage to establish governance systems of their own, and the apparent logic of the atrocities committed by these quasi-states looks very similar to the logic behind the atrocities perpetrated by their foes: destroy your rival’s base of support, and scare civilians into compliance or complicity.

Rebels don’t need to govern to carry out mass killings, though, a point driven home by groups like the RUF in Sierra Leone, the Seleka and anti-balaka militias in the Central African Republic, and, of course, Boko Haram. Sometimes the states we now expect to protect civilians against such violence are so weak or absent or uncaring that those non-state groups don’t need deep pockets and sprawling organizations to accomplish mass murder. On Boko Haram, CFR’s John Campbell observes that, “Several of the most recent incidents involve government security forces unaccountably not at their posts, allowing Boko Haram freedom of movement. The governor of Borno state publicly said that Boko Haram fighters outgun government forces.” Campbell also notes that those security forces might be shirking their duty because they are poorly paid and equipped, and because they simply fear a group that “has a long tradition of killing any person in the security services that it can.” With a state like that, the resources and organization required to accomplish mass murder are, unfortunately, not so vast. What is required is a degree of ruthlessness that most of us find hard to understand, but that incomprehensibility should not be confused with impossibility.

Acts we conventionally describe as “terrorism” nowadays are also atrocities by another name, and so-called terrorist groups occasionally succeed in their lethal business on an extraordinary scale. Al Qaeda’s attacks on September 11, 2001, certainly qualify as a mass killing as we conventionally define it. Nearly 3,000 noncombatant civilians from a discrete group (Americans) were deliberately killed as part of a wider political conflict, and all in a single day. The torrent of car bombings and other indiscriminate attacks in Iraq in recent months has surely crossed that arbitrary 1,000-death threshold by now, too.

For analytical purposes, it would be useful to have a catalog of episodes in which non-state organizations committed atrocities on such a large scale. That catalog would allow us to try to glean patterns and develop predictive models from their comparison to each other and, more important, to situations in which those episodes did not occur. Even more useful would be a reliable assemblage of data on the incidents comprising those episodes, so we could carefully study how and where they arise and accumulate over time, perhaps with some hope of halting or at least mitigating future episodes as they develop.

Unfortunately, the data we want usually aren’t the data we have, and that’s true here, too. The Uppsala Conflict Data Program (UCDP) has compiled a data set on “one-sided violence,” defined as “intentional attacks on civilians by governments and formally organized armed groups,” that includes low, high, and best estimates of deaths attributed to each perpetrator group in cases where that annual estimate is 25 deaths or more (here). These data are an excellent start, but they only cover years since 1989, so the number of episodes involving non-state groups as perpetrators is still very small. The Armed Conflict Location & Event Data Project (ACLED) compiles detailed data (here) on attacks by non-state groups, among others, but it only covers Africa since 1997. New developments in the automated production of political event data hint at the possibility of analyzing deliberate violence against civilians around the world at a much higher resolution in the not-too-distant future. As I’ve discovered in an ongoing effort to adapt one of these data sets to this purpose, however, we’re not quite there yet (see here).

In the meantime, we’ll keep seeing accounts of murderous sprees by groups like Boko Haram (here and here, to pick just two) and CAR’s Seleka (here) and anti-balaka (here) alongside the thrum of reporting on atrocities from places like Syria and Sudan. And as we read, we would do well to remember that people, not states, are the common denominator.

PS. In the discussion of relevant data sets, I somehow forgot to mention that the Political Instability Task Force also funds the continuing collection of data on “atrocities” around the world involving five or more civilian fatalities (here). These data, which run all the way back to January 1995, are carefully compiled under the direction of a master of the craft, but they also suffer from the inevitable problems of reporting bias that plague all such efforts and so must be handled with care (see Will Moore here and here on this subject).

Watch Experts’ Beliefs Evolve Over Time

On 15 December 2013, “something” happened in South Sudan that quickly began to spiral into a wider conflict. Prior research tells us that mass killings often occur on the heels of coup attempts and during civil wars, and at the time South Sudan ranked among the world’s countries at greatest risk of state-led mass killing.

Motivated by these two facts, I promptly added a question about South Sudan to the opinion pool we’re running as part of a new atrocities early-warning system for the U.S. Holocaust Memorial Museum’s Center for the Prevention of Genocide (see this recent post for more on that). As it happened, we already had one question running about the possibility of a state-led mass killing in South Sudan targeting the Murle, but the spiraling conflict clearly implied a host of other risks. Posted on 18 December 2013, the new question asked, “Before 1 January 2015, will an episode of mass killing occur in South Sudan?”

The criteria we gave our forecasters to understand what we mean by “mass killing” and how we would decide if one has happened appear under the Background Information header at the bottom of this post. Now, shown below is an animated sequence of kernel density plots of each day’s forecasts from all participants who’d chosen to answer this question. A kernel density plot is like a histogram, but with some nonparametric estimation thrown in to try to get at the distribution of a variable’s “true” values from the sample of observations we’ve got. If that sounds like gibberish to you, just think of the peaks in the plots as clumps of experts who share similar beliefs about the likelihood of mass killing in South Sudan. The taller the peak, the bigger the clump. The farther right the peak, the more likely that clump thinks a mass killing is.

[Animation: daily kernel density plots of forecasts on the South Sudan mass-killing question]

I see a couple of interesting patterns in those plots. The first is the rapid rightward shift in the distribution’s center of gravity. As the fighting escalated and reports of atrocities began to trickle in (see here for one much-discussed article from the time), many of our forecasters quickly became convinced that a mass killing would occur in South Sudan in the coming year, if one wasn’t occurring already. On 23 December—the date the aforementioned article appeared—the average forecast jumped to approximately 80 percent, and it hasn’t fallen below that level since.

The second pattern that catches my eye is the appearance in January of a long, thin tail in the distribution that reaches into the lower ranges. That shift in the shape of the distribution coincides with stepped-up efforts by U.N. peacekeepers to stem the fighting and the start of direct talks between the warring parties. I can’t say for sure what motivated that shift, but it looks like our forecasters split in their response to those developments. While most remained convinced that a mass killing would occur or already had, a few forecasters were apparently more optimistic about the ability of those peacekeepers or talks or both to avert a full-blown mass killing. A few weeks later, it’s still not clear which view is correct, although a forthcoming report from the U.N. Mission in South Sudan may soon shed more light on this question.

I think this set of plots is interesting on its face for what it tells us about the urgent risk of mass atrocities in South Sudan. At the same time, I also hope this exercise demonstrates the potential to extract useful information from an opinion pool beyond a point-estimate forecast. We know from prior and ongoing research that those point estimates can be quite informative in their own right. Still, by looking at the distribution of participants’ forecasts on a particular question, we can glean something about the degree of uncertainty around an event of interest or concern. By looking for changes in that distribution over time, we can also get a more complete picture of how the group’s beliefs evolve in response to new information than a simple line plot of the average forecast could ever tell us. Look for more of this work as our early-warning system comes online, hopefully in the next few months.

UPDATE (7 Feb): At the urging of Trey Causey, I tried making another version of this animation in which the area under the density plot is filled in. I also decided to add a vertical line to show each day’s average forecast, which is what we currently report as the single-best forecast at any given time. Here’s what that looks like, using data from a question on the risk of a mass killing occurring in the Central African Republic before 2015. We closed this question on 19 December 2013, when it became clear through reporting by Human Rights Watch and others that an episode of mass killing had occurred.

[Animation: filled kernel density plots with the daily average forecast, Central African Republic mass-killing question]
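For readers who want to see the mechanics, here is a minimal sketch in R of how one frame of plots like these can be built. The vector of forecasts is invented for illustration; the real system pulls each participant’s latest estimate for the day.

```r
# One hypothetical day's latest forecasts from each participant (made up)
fcasts <- c(0.95, 0.90, 0.85, 0.92, 0.60, 0.15, 0.88)

d <- density(fcasts, from = 0, to = 1)   # kernel density estimate over [0, 1]
plot(d, main = "Forecast distribution", xlab = "Estimated probability")
polygon(d, col = "gray80", border = NA)  # fill the area under the curve
abline(v = mean(fcasts), lty = 2)        # dashed line at the average forecast
```

Looping that code over successive days and stitching the resulting frames together yields animations like the ones shown here.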

Background Information

We will consider a mass killing to have occurred when the deliberate actions of state security forces or other armed groups result in the deaths of at least 1,000 noncombatant civilians over a period of one year or less.

  • A noncombatant civilian is any person who is not a current member of a formal or irregular military organization and who does not apparently pose an immediate threat to the life, physical safety, or property of other people.
  • The reference to deliberate actions distinguishes mass killing from deaths caused by natural disasters, infectious diseases, the accidental killing of civilians during war, or the unanticipated consequences of other government policies. Fatalities should be considered intentional if they result from actions designed to compel or coerce civilian populations to change their behavior against their will, as long as the perpetrators could have reasonably expected that these actions would result in widespread death among the affected populations. Note that this definition also covers deaths caused by other state actions, if, in our judgment, perpetrators enacted policies/actions designed to coerce civilian populations and could have expected that these policies/actions would lead to large numbers of civilian fatalities. Examples of such actions include, but are not limited to: mass starvation or disease-related deaths resulting from the intentional confiscation, destruction, or denial of medicines or other healthcare supplies; and deaths occurring during forced relocation or forced labor.
  • To distinguish mass killing from large numbers of unrelated civilian fatalities, the victims of mass killing must appear to be perceived by the perpetrators as belonging to a discrete group. That group may be defined communally (e.g., ethnic or religious), politically (e.g., partisan or ideological), socio-economically (e.g., class or professional), or geographically (e.g., residents of specific villages or regions). In this way, apparently unrelated executions by police or other state agents would not qualify as mass killing, but capital punishment directed against members of a specific political or communal group would.

The determination of whether or not a mass killing has occurred will be made by the administrators of this system using publicly available secondary sources and in consultation with subject-matter experts. Relevant evidence will be summarized in a blog post published when the determination is announced, and any dissenting views will be discussed as well.

Will Unarmed Civilians Soon Get Massacred in Ukraine?

According to one pool of forecasters, most probably not.

As part of a public atrocities early-warning system I am currently helping to build for the U.S. Holocaust Memorial Museum’s Center for the Prevention of Genocide (see here), we are running a kind of always-on forecasting survey called an opinion pool. An opinion pool is similar in spirit to a prediction market, but instead of having participants trade shares tied to the occurrence of some future event, we simply ask participants to estimate the probability of each event’s occurrence. In contrast to a traditional survey, every question remains open until the event occurs or the forecasting window closes. This way, participants can update their forecasts as often as they like, as they see or hear relevant information or just change their minds.

With generous support from Inkling, we started up our opinion pool in October, aiming to test and refine it before our larger early-warning system makes its public debut this spring (we hope). So far, we have only recruited opportunistically among colleagues and professional acquaintances, but we already have more than 70 registered participants. In the first four months of operation, we have used the system to ask more than two dozen questions, two of which have since closed because the relevant events occurred (mass killing in CAR and the Geneva II talks on Syria).

Over the next few years, we aim to recruit a large and diverse pool of volunteer forecasters from around the world with some claim to topical expertise or relevant local knowledge. The larger and more diverse our pool, the more accurate we expect our forecasts to be, and the wider the array of questions we can ask. (If you are interested in participating, please drop me a line at ulfelder <at> gmail <dot> com.)

A few days ago, prompted by a couple of our more active members, I posted a question to our pool asking, “Before 1 March 2014, will any massacres occur in Ukraine?” As of this morning, our pool had made a total of 13 forecasts, and the unweighted average of the latest of those estimates from each participating forecaster was just 15 percent. Under the criteria we specified (see Background Information below), this forecast does not address the risk of large-scale violence against or among armed civilians, nor does it exclude the possibility of a series of small but violent encounters that cumulatively produce a comparable or larger death toll. Still, for those of us concerned that security forces or militias will soon kill nonviolent protesters in Ukraine on a large scale, our initial forecast implies that those fears are probably unwarranted.

[Chart: Crowd-Estimated Probability of Any Massacres in Ukraine Before 1 March 2014]
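For the curious, the headline number reported above is just the unweighted mean of each participant’s most recent estimate. Here is a sketch of that computation in R, with a made-up data frame standing in for the pool’s records; the column names are invented.

```r
# Hypothetical log of all forecasts submitted on one question
pool <- data.frame(
  forecaster = c("a", "a", "b", "c", "c"),
  timestamp  = as.POSIXct(c("2014-02-01 10:00", "2014-02-05 09:00",
                            "2014-02-04 12:00", "2014-02-03 16:00",
                            "2014-02-06 08:00")),
  estimate   = c(0.30, 0.20, 0.10, 0.25, 0.10)
)

pool <- pool[order(pool$forecaster, pool$timestamp), ]    # sort by time
latest <- tapply(pool$estimate, pool$forecaster, tail, 1) # newest per person
mean(latest)  # crowd forecast: (0.20 + 0.10 + 0.10) / 3, about 0.13
```

Because every question stays open until it resolves, re-running this calculation whenever someone logs a new estimate keeps the crowd forecast current.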

Obviously, we don’t have a crystal ball, and this is just an aggregation of subjective estimates from a small pool of people, none of whom (I think) is on the scene in Ukraine or has inside knowledge of the decision-making of relevant groups. Still, a growing body of evidence shows that aggregations of subjective forecasts like this one can often be usefully accurate (see here), even with a small number of contributing forecasters (see here). On this particular question, I very much hope our crowd is right. Whatever happens in Ukraine over the next few weeks, though, principle and evidence suggest that the method is sound, and we soon expect to be using this system to help assess risks of mass atrocities all over the world in real time.

Background Information

We define a “massacre” as an event that has the following features:

  • At least 10 noncombatant civilians are killed in one location (e.g., neighborhood, town, or village) in less than 48 hours. A noncombatant civilian is any person who is not a current member of a formal or irregular military organization and who does not apparently pose an immediate threat to the life, physical safety, or property of other people.
  • The victims appear to have been the primary target of the violence that killed them.
  • The victims do not appear to have been engaged in violent action or criminal activity when they were killed, unless that violent action was apparently in self-defense.
  • The relevant killings were carried out by individuals affiliated with a social group or organization engaged in a wider political conflict and appear to be connected to each other and to that wider conflict.

Those features will not always be self-evident or uncontroversial, so we use the following series of ad hoc rules to make more consistent judgments about ambiguous events.

  • Police, soldiers, prison guards, and other agents of state security are never considered noncombatant civilians, even if they are killed while off duty or out of uniform.
  • State officials and bureaucrats are not considered civilians when they are apparently targeted because of their professional status (e.g., assassinated).
  • Civilian deaths that occur in the context of operations by uniformed military-service members against enemy combatants are considered collateral damage, not atrocities, and should be excluded unless there is strong evidence that the civilians were targeted deliberately. We will err on the side of assuming that they were not.
  • Deaths from state repression of civilians engaged in nonviolent forms of protest are considered atrocities. Deaths resulting from state repression targeting civilians who were clearly engaged in rioting, looting, attacks on property, or other forms of collective aggression or violence are not.
  • Non-state militant or paramilitary groups, such as militias, gangs, vigilante groups, or raiding parties, are considered combatants, not civilians.

We will use contextual knowledge to determine whether or not a discrete event is linked to a wider conflict or campaign of violence, and we will err on the side of assuming that it is.

Determinations of whether or not a massacre has occurred will be made by the administrator of this system using publicly available secondary sources. Relevant evidence will be summarized in a blog post published when the determination is announced, and any dissenting views will be discussed as well.

Disclosure

I have argued on this blog that scholars have an obligation to disclose potential conflicts of interest when discussing their research, so let me do that again here: For the past two years, I have been paid as a contractor by the U.S. Holocaust Memorial Museum for my work on the atrocities early-warning system discussed in this post. Since the spring of 2013, I have also been paid to write questions for the Good Judgment Project, in which I participated as a forecaster the year before. To the best of my knowledge, I have no financial interests in, and have never received any payments from, any companies that commercially operate prediction markets or opinion pools.

What the U.S. Intelligence Community Says About Mass Atrocities in 2014

Here’s what Director of National Intelligence James Clapper said about the risk of mass atrocities this year in the Worldwide Threat Assessment he delivered today to the Senate Select Committee on Intelligence:

The overall risk of mass atrocities worldwide will probably increase in 2014 and beyond. Trends driving this increase include more social mobilization, violent conflict, including communal violence, and other forms of instability that spill over borders and exacerbate ethnic and religious tensions; diminished or stagnant quality of governance; and widespread impunity for past abuses. Many countries at risk of mass atrocities will likely be open to influence to prevent or mitigate them. This is because they are dependent on Western assistance or multilateral missions in their countries, have the political will to prevent mass atrocities, or would be responsive to international scrutiny. Overall international will and capability to prevent or mitigate mass atrocities will likely diminish in 2014 and beyond, although support for human rights norms to prevent atrocities will almost certainly deepen among some non-government organizations. Much of the world will almost certainly turn to the United States for leadership to prevent and respond to mass atrocities.

That’s a ton of analysis crammed into a single paragraph, and I suspect a lot of person-hours went into the construction of those six sentences.

However many hours it was, I think the results are largely correct. After two decades of relative quiescence, we’ve seen a troubling rebound in the occurrence of mass atrocities in the past few years, and the systemic forces that seem to be driving that rebound don’t yet show signs of abating.

One point on which I disagree with the IC’s analysis, though, is the claim that “widespread impunity for past abuses” is helping to fuel the upward trend in mass atrocities. I don’t think this assertion is flat-out false; I just think it’s overblown and over-confident. As Mark Kersten argued last week in a blog post on the debate over whether or not the situation in Syria should be referred to the International Criminal Court (ICC),

Any suggestion that international criminal justice should be pursued in the context of ongoing hostilities in Syria leads us to the familiar “peace versus justice” debate. Within this debate, there are broadly two camps: one which views international criminal justice as a necessary and useful tool which can deter crimes, marginalize perpetrators and even be conducive to peace negotiations; and a second camp which sees judicial interventions as deleterious to peace talks and claims that it creates disincentives for warring parties to negotiate and leads to increased levels of violence.

So who’s right? I think Kersten is when he says this:

It remains too rarely conceded that the Court’s effects are mixed and, even more rarely, that they might be negligible. This points to the ongoing need to reimagine how we study and assess the effects of the ICC on ongoing and active conflicts. There is little doubt that the Court can have negative and positive effects on the ability of warring parties and interested actors to transform conflicts and establish peace. But this shouldn’t lead to a belief that the ICC must have these effects across cases. In some instances, the Court may actually have minimal or even inconsequential effects. As importantly, in many if not most cases, the ICC won’t be the be-all and end-all of peace processes. Even when the Court has palpable effects, peace processes aren’t likely to flourish or perish on the hill of international criminal justice.

Finally, I’m not sure what the Threat Assessment’s drafters had in mind when they wrote that “overall international will and capability to prevent or mitigate mass atrocities will likely diminish in 2014 and beyond.” I suspect that statement is a nod in the direction of declinists who worry that a recalcitrant Russia and rising China spell trouble for the supposed Pax Americana, but that’s just a guess.

In any case, I think the assertion is wrong. Syria is the horror that seems to lurk behind this point, and there’s no question that the escalation and spread of that war represents one of the greatest failures of global governance in modern times. Even as the war in Syria continues, though, international forces have mobilized to stem fighting in the Central African Republic and South Sudan, two conflicts that are already terrible but could also get much, much worse. Although the long-term effects of those mobilizations remain unclear, the very fact of their occurrence undercuts the claim that international will and capability to respond to mass atrocities are flagging.

A New Statistical Approach to Assessing Risks of State-Led Mass Killing

Which countries around the world are currently at greatest risk of an onset of state-led mass killing? At the start of the year, I posted results from a wiki survey that asked this question. Now, here in heat-map form are the latest results from a rejiggered statistical process with the same target. You can find a dot plot of these data at the bottom of the post, and the data and code used to generate them are on GitHub.

[Heat map: Estimated Risk of a New Episode of State-Led Mass Killing]

These assessments represent the unweighted average of probabilistic forecasts from three separate models trained on country-year data covering the period 1960-2011. In all three models, the outcome of interest is the onset of an episode of state-led mass killing, defined as any episode in which the deliberate actions of state agents or other organizations kill at least 1,000 noncombatant civilians from a discrete group. The three models are:

  • PITF/Harff. A logistic regression model approximating the structural model of genocide/politicide risk developed by Barbara Harff for the Political Instability Task Force (PITF). In its published form, the Harff model only applies to countries already experiencing civil war or adverse regime change and produces a single estimate of the risk of a genocide or politicide occurring at some time during that crisis. To build a version of the model that was more dynamic, I constructed an approximation of the PITF’s global model for forecasting political instability and used the natural log of the predicted probabilities it produces as an additional input to the Harff model. This approach mimics the one used by Harff and Ted Gurr in their ongoing application of the genocide/politicide model for risk assessment (see here).
  • Elite Threat. A logistic regression model that uses the natural log of predicted probabilities from two other logistic regression models—one of civil-war onset, the other of coup attempts—as its only inputs. This model is meant to represent the argument put forth by Matt Krain, Ben Valentino, and others that states usually engage in mass killing in response to threats to ruling elites’ hold on power.
  • Random Forest. A machine-learning technique (see here) applied to all of the variables used in the two previous models, plus a few others of possible relevance, using the ‘randomForest’ package in R. A couple of parameters were tuned on the basis of a gridded comparison of forecast accuracy in 10-fold cross-validation. (A compressed sketch of all three models appears just after this list.)
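For readers who want to see the plumbing, here is a compressed sketch of all three models in R. The data frame, covariate names, and new-data object are invented stand-ins, and the scripts on GitHub remain the authoritative version; this is only meant to show the two-stage structure, in which predicted probabilities from upstream models become inputs to the downstream ones.

```r
library(randomForest)

# `dat` is a hypothetical country-year data frame with a binary `onset`
# indicator and invented covariate names; the real inputs live on GitHub.

# Upstream models whose predicted probabilities feed the downstream logits
instab <- glm(instability_onset ~ infant_mortality + regime_type + region,
              data = dat, family = binomial)
civwar <- glm(civilwar_onset ~ gdp_growth + polity_score + region,
              data = dat, family = binomial)
coup   <- glm(coup_attempt ~ gdp_growth + polity_score + years_since_coup,
              data = dat, family = binomial)
dat$ln_p_instab <- log(predict(instab, dat, type = "response"))
dat$ln_p_civwar <- log(predict(civwar, dat, type = "response"))
dat$ln_p_coup   <- log(predict(coup,   dat, type = "response"))

# 1. PITF/Harff: Harff's risk factors plus the logged instability forecast
harff <- glm(onset ~ ln_p_instab + autocracy + elite_ethnicity +
               exclusionary_ideology + trade_openness + prior_genocide,
             data = dat, family = binomial)

# 2. Elite threat: the logged civil-war and coup forecasts as the only inputs
elite <- glm(onset ~ ln_p_civwar + ln_p_coup, data = dat, family = binomial)

# 3. Random Forest over all candidate inputs
dat$onset_f <- factor(dat$onset)
inputs <- c("infant_mortality", "regime_type", "gdp_growth", "polity_score",
            "elite_ethnicity", "trade_openness", "ln_p_civwar", "ln_p_coup")
rf <- randomForest(onset_f ~ ., data = dat[, c("onset_f", inputs)], ntree = 1000)

# Ensemble: unweighted average of the three predicted probabilities for
# `newdat`, a data frame of current inputs with the same columns
p <- cbind(predict(harff, newdat, type = "response"),
           predict(elite, newdat, type = "response"),
           predict(rf, newdat, type = "prob")[, "1"])
risk <- rowMeans(p)
```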

The Random Forest proved to be the most accurate of the three models in stratified 10-fold cross-validation. The chart below is a kernel density plot of the areas under the ROC curve for the out-of-sample estimates from that cross-validation drill. As the chart shows, the average AUC for the Random Forest was in the low 0.80s, compared with the high 0.70s for the PITF/Harff and Elite Threat models. As expected, the average of the forecasts from all three performed even better than the best single model, albeit not by much. These out-of-sample accuracy rates aren’t mind-blowing, but they aren’t bad either, and they are as good as or better than many of the ones I’ve seen from similar efforts to anticipate the onset of rare political crises in countries worldwide.

[Chart: Distribution of Out-of-Sample AUC Scores by Model in 10-Fold Cross-Validation]
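The cross-validation drill itself is simple to sketch. Assuming the same hypothetical `dat` and a helper `fit_and_forecast()` that refits a model on the training folds and returns predicted probabilities for the test fold (a stand-in for the real estimation code), the fold-level AUCs can be computed like this:

```r
# Mann-Whitney formulation of the area under the ROC curve
auc <- function(p, y) {
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Stratified fold assignment: split onsets and non-onsets separately so
# every fold gets a share of the rare positive cases
set.seed(709)
dat$fold <- NA
for (k in 0:1) {
  idx <- which(dat$onset == k)
  dat$fold[idx] <- sample(rep(1:10, length.out = length(idx)))
}

auc_by_fold <- sapply(1:10, function(f) {
  train <- dat[dat$fold != f, ]
  test  <- dat[dat$fold == f, ]
  auc(fit_and_forecast(train, test), test$onset)  # hypothetical helper
})
plot(density(auc_by_fold))  # one model's curve in a chart like the one above
```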

The decision to use an unweighted average for the combined forecast might seem simplistic, but it’s actually a principled choice in this instance. When examples of the event of interest are hard to come by and we have reason to believe that the process generating those events may be changing over time, sticking with an unweighted average is a reasonable hedge against risks of over-fitting the ensemble to the idiosyncrasies of the test set used to tune it. For a longer discussion of this point, see pp. 7-8 in the last paper I wrote on this work and the paper by Andreas Graefe referenced therein.

Any close readers of my previous work on this topic over the past couple of years (see here and here) will notice that one model has been dropped from this latest version of the ensemble, namely, the one proposed by Michael Colaresi and Sabine Carey in their 2008 article, “To Kill or To Protect” (here). As I was reworking my scripts to make regular updating easier (more on that below), I paid closer attention than I had before to the fact that the Colaresi and Carey model requires a measure of the size of state security forces that is missing for many country-years. In previous iterations, I had worked around that problem by using a categorical version of this variable that treated missingness as a separate category, but this time I noticed that there were fewer than 20 mass-killing onsets in country-years for which I had a valid observation of security-force size. With so few examples, we’re not going to get reliable estimates of any pattern connecting the two. As it happened, this model—which, to be fair to its authors, was not designed to be used as a forecasting device—was also by far the least accurate of the lot in 10-fold cross-validation. Putting two and two together, I decided to consign this one to the scrap heap for now. I still believe that measures of military forces could help us assess risks of mass killing, but we’re going to need more and better data to incorporate that idea into our multimodel ensemble.

The bigger and in some ways more novel change from previous iterations of this work concerns the unorthodox approach I’m now using to make the risk assessments as current as possible. All of the models used to generate these assessments were trained on country-year data, because that’s the only form in which most of the requisite data is produced. To mimic the eventual forecasting process, the inputs to those models are all lagged one year at the model-estimation stage—so, for example, data on risk factors from 1985 are compared with outcomes in 1986, 1986 inputs to 1987 outcomes, and so on.
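In code, that alignment amounts to pairing each country-year’s inputs with the following year’s outcome. A minimal sketch, again using the hypothetical `dat` from above:

```r
# Lead the outcome by one year within each country, so that 1985 inputs
# line up with 1986 outcomes, 1986 with 1987, and so on
dat <- dat[order(dat$country, dat$year), ]
dat$onset_next <- ave(dat$onset, dat$country, FUN = function(x) c(x[-1], NA))
# models are then estimated with onset_next as the response, dropping each
# country's final year, for which the outcome is not yet known
```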

If we stick rigidly to that structure at the forecasting stage, then I need data from 2013 to produce 2014 forecasts. Unfortunately, many of the sources for the measures used in these models won’t publish their 2013 data for at least a few more months. Faced with that problem, I could do something like what I aim to do with the coup forecasts I’ll be producing in the next few days—that is, use only data from sources that quickly and reliably update soon after the start of each year. The catch is that doing so would mean omitting many of the variables most specific to the risk of mass atrocities—things like the occurrence of violent civil conflict or the political salience of elite ethnicity.

So now I’m trying something different. Instead of waiting until every last input has been updated for the previous year and everything aligns neatly in my rectangular data set, I simply apply my algorithms to the most recent available observation of each input. It took some trial and error to write, but I now have an R script that automates this process at the country level by pulling the time series for each variable, omitting the missing values, reversing the series order, snipping off the observation now at the start of that string, collecting those snippets in a new vector, and running that vector through the previously estimated model objects to get a forecast (see the section of this script starting at line 284).
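
The heart of that routine for a single country looks something like the chunk below; cdat and predictors are hypothetical stand-ins for one country’s data frame and the list of input variables:

```r
# For each input, take the most recent non-missing observation
latest <- sapply(predictors, function(v) {
  x <- na.omit(cdat[[v]])          # pull the series and drop missing values
  x <- rev(x)                      # reverse it so the newest value comes first
  if (length(x) > 0) x[1] else NA  # snip off that first, freshest value
})
# Collect the snippets and run them through the fitted model objects, e.g.:
# predict(rf.fit, newdata = as.data.frame(t(latest)), type = "prob")[, "1"]
```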

One implicit goal of this approach is to make it easier to move to batch processing, where the forecasting engine routinely and automatically pings the data sources online and updates whenever any of the requisite inputs has changed. So, for example, when the vaunted Polity IV Project releases its 2013 update in a few months, my forecasting contraption will catch and ingest the new version, and the forecasts will change accordingly. I now have scripts that can do the statistical part but will be leaning on other folks to automate the wider routine as part of the early-warning system I’m helping build for the U.S. Holocaust Memorial Museum’s Center for the Prevention of Genocide.
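
The statistical side can already detect a changed source. One hedged way to do it, with a hypothetical URL and file path:

```r
# Re-download a source file and report whether it differs from the local copy;
# a TRUE result would trigger a fresh run of the forecasting scripts
check.and.update <- function(url, local.file) {
  tmp <- tempfile()
  download.file(url, tmp, quiet = TRUE)
  changed <- !file.exists(local.file) ||
    (tools::md5sum(tmp) != tools::md5sum(local.file))
  if (changed) file.copy(tmp, local.file, overwrite = TRUE)
  changed
}
# e.g., check.and.update("http://example.org/polity.2013.csv", "data/polity.csv")
```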

The big upside of this opportunistic approach to updating is that the risk assessments are always as current as possible, conditional on the limitations of the available data. The way I figure, when you don’t have information that’s as fresh as you’d like, use the freshest information you’ve got.

The downside of this approach is that it’s not clear exactly what the outputs from that process represent. Technically, a forecast is a probabilistic statement about the likelihood of a specific event during a specific time period. The outputs from this process are still probabilistic statements about the likelihood of a specific event, but they are no longer anchored to a specific time period. The probabilities mapped at the top of this post mostly use data from 2012, but the inputs for some variables for some cases are a little older, while the inputs for some of the dynamic variables (e.g., GDP growth rates and coup attempts) are essentially current. So are those outputs forecasts for 2013, or for 2014, or something else?

For now, I’m going with “something else” and am thinking of the outputs from this machinery as the most up-to-date statistical risk assessments I can produce, but not forecasts as such. That description will probably sound like fudging to most statisticians, but it’s meant to be an honest reflection of both the strengths and limitations of the underlying approach.

To any gear heads who’ve read this far: I’d really appreciate hearing your thoughts on this strategy and any ideas you might have on other ways to resolve this conundrum, or on any other aspect of this forecasting process. As noted at the top, the data and code used to produce these estimates are posted online. This work is part of a soon-to-launch, public early-warning system, so we hope and expect that these assessments will have some effect on policy and advocacy planning processes. Given that aim, it behooves us to do whatever we can to make them as accurate as possible, so I would very much welcome any suggestions on how to do or describe this better.

Finally and as promised, here is a dot plot of the estimates mapped above. Countries are shown in descending order by estimated risk. The gray dots mark the forecasts from the three component models, and the red dot marks the unweighted average.

[Figure: dot plot of estimated risks by country]

PS. In preparation for a presentation on this work at an upcoming workshop, I made a new map of the current assessments that works better, I think, than the one at the top of this post. Instead of coloring by quintiles, this new version (below) groups cases into several bins that roughly represent doublings of risk: less than 1%, 1-2%, 2-4%, 4-8%, and 8-16%. This version shows more accurately that the vast majority of countries are at extremely low risk, and it draws out the variation in risk among the countries that are not.

Estimated Risk of New State-Led Mass Killing
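
The binning behind this version is a one-liner in R; risk here is a hypothetical vector holding the ensemble probabilities:

```r
# Bin estimated risk at rough doublings: <1%, 1-2%, 2-4%, 4-8%, 8-16%
risk.bin <- cut(risk, breaks = c(0, 0.01, 0.02, 0.04, 0.08, 0.16),
                labels = c("<1%", "1-2%", "2-4%", "4-8%", "8-16%"),
                include.lowest = TRUE)
table(risk.bin)  # most countries should land in the lowest bin
```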

A Coda to “Using GDELT to Monitor Atrocities, Take 2”

I love doing research in the Internet Age. As I’d hoped it would, my post yesterday on the latest iteration of the atrocities-monitoring system we’re developing has already sparked a lot of really helpful responses. Some of those responses are captured in comments on the post, but not all of them are. So, partly as a public good and partly for my own record-keeping, I thought I’d write a coda to that post enumerating the leads it generated and some of my reactions to them.

Give the Machines Another Shot at It

As a way to reduce or even eliminate the burden placed on our human(s) in the loop, several people suggested something we’ve been considering for a while: use machine-learning techniques to develop classifiers that can be used to further reduce the data left after our first round of filtering. These classifiers could consider all of the features in GDELT, not just the event and actor types we’re using in our R script now. If we’re feeling really ambitious, we could go all the way back to the source stories and use natural-language processing to look for additional discriminatory power there. This second round might not eliminate the need for human review, but it certainly could lighten the load.

The comment threads on this topic (here and here) nicely capture what I see as the promise and likely limitations of this strategy, so I won’t belabor it here. For now, I’ll just note that how well this would work is an empirical question, and it’s one we hope to get a chance to answer once we’ve accumulated enough screened data to give those classifiers a fighting chance.

Leverage GDELT’s Global Knowledge Graph

Related to the first idea, GDELT co-creator Kalev Leetaru has suggested on a couple of occasions that we think about ways to bring the recently created GDELT Global Knowledge Graph (GKG) to bear on our filtering task. As Kalev describes in a post on the GDELT blog, the GKG consists of two data streams, one that records mentions of various counts and another that captures connections in each day’s news between “persons, organizations, locations, emotions, themes, counts, events, and sources.” That second stream in particular includes a bunch of data points that we can connect to specific event records and thus use as additional features in the kind of classifiers described under the previous header. In response to my post, Kalev sent this email to me and a few colleagues:

I ran some very, very quick numbers on the human coding results Jay sent me, where a human coded 922 articles covering 9 days of GDELT events and coded 26 of them as atrocities. Of course, 26 records isn’t enough to get any kind of statistical latch on to build a training model, but the spectral response of the various GKG themes is quite informative. For events tagged as being an atrocity, themes such as ETHNICITY, RELIGION, HUMAN_RIGHTS, and a variety of functional actors like Villagers, Doctors, Prophets, and Activists show up in the top themes, whereas in the non-atrocities the roles are primarily political leaders, military personnel, authorities, etc. As just a simple example, the HUMAN_RIGHTS theme appeared in just 6% of non-atrocities but 30% of atrocities, while Activists show up in 33% of atrocities compared with just 4% of non-atrocities, and the list goes on.

Again, 26 articles isn’t enough to build a model on, but just glancing over the breakdown of the GKG themes for the two groups, there is a really strong and clear separation between them across the entire set of themes, and the breakdown fits precisely what Bayesian classifiers like (they are the most accurate for this kind of separation task and outperform SVM and random forests).

So, Jay, the bottom line is that if you can start recording each day the list of articles that you guys review and the ones you flag as an atrocity and give me a nice dataset over time, should be pretty easy to dramatically filter these down for you at the very least.

As I’ve said throughout this process, it’s not that event data can’t do what is needed; it’s that you often have to bring additional signals into the mix to accomplish your goals when the thing you’re after requires signals beyond what the event records are capturing.

What Kalev suggests at the end there—keep a record of all the events we review and the decisions we make on them—is what we’re doing now, and I hope we can expand on his experiment in the next several months.

Crowdsource It

Jim Walsh left a thoughtful comment suggesting that we crowdsource the human coding:

Seems to me like a lot of people might be willing to volunteer their time for this important issue–human rights activists and NGO types, area experts, professors and their students (who might even get some credit and learn about coding). If you had a large enough cadre of volunteers, could assign many (10 or more?) to each day’s data and generate some sort of average or modal response. Would need someone to organize the volunteers, and I’m not sure how this would be implemented online, but might be do-able.

As I said in my reply to him, this is an approach we’ve considered but rejected for now. We’re eager to take advantage of the wisdom of interested crowds and are already doing so in big ways on other parts of our early-warning system, but I have two major concerns about how well it would work for this particular task.

The first is the recruiting problem, and here I see a Catch-22: people are less inclined to do this if they don’t believe the system works, but it’s hard to convince them that the system works if we don’t already have a crowd involved to make it go. This recruiting problem becomes especially acute in a system with time-sensitive deliverables. If we promise daily updates, we need to produce daily updates, and it’s hard to do that reliably if we depend on self-organized labor.

My second concern is the principal-agent problem. Our goal is to make reliable and valid data in a timely way, but there are surely people out there who would bring goals to the process that might not align with ours. Imagine, for example, that Absurdistan appears in the filtered-but-not-yet-coded data to be committing atrocities, but citizens (or even paid agents) of Absurdistan don’t like that idea and so organize to vote those events out of the data set. It’s possible that our project would be too far under the radar for anyone to bother, but our ambitions are larger than that, so we don’t want to assume that will be true. If we succeed at attracting the kind of attention we hope to attract, the deeply political and often controversial nature of our subject matter would make crowdsourcing this task more vulnerable to this kind of failure.

Use Mechanical Turk

Both of the concerns I have about the downsides of crowdsourcing the human-coding stage could be addressed by Ryan Briggs’ suggestion via Twitter to have Amazon Mechanical Turk do it. A hired crowd is there when you need it and (usually) doesn’t bring political agendas to the task. It’s also relatively cheap, and you only pay for work performed.

Thanks to our collaboration with Dartmouth’s Dickey Center, the marginal cost of the human coding isn’t huge, so it’s not clear that Mechanical Turk would offer much advantage on that front. Where it could really help is in routinizing the daily updates. As I mentioned in the initial post, when you depend on human action and have just one or a few people involved, it’s hard to establish a set of routines that covers weekends and college breaks and sick days and is robust to periodic changes in personnel. Primarily for this reason, I hope we’ll be able to run an experiment with Mechanical Turk where we can compare its cost and output to what we’re paying and getting now and see if this strategy might make sense for us.

Don’t Forget About Errors of Omission

Last but not least, a longtime colleague had this to say in an email reacting to the post (hyperlinks added):

You are effectively describing a method for reducing errors of commission, events coded by GDELT as atrocities that, upon closer inspection, should not be. It seems like you also need to examine errors of omission. This is obviously harder. Two possible opportunities would be to compare to either [the PITF Worldwide Atrocities Event Data Set] or to ACLED.  There are two questions. Is GDELT “seeing” the same source info (and my guess is that it is and more, though ACLED covers more than just English sources and I’m not sure where GDELT stands on other languages). Then if so (and there are errors of omission) why aren’t they showing up (coded as different types of events or failed to trigger any coding at all)[?]

It’s true that our efforts so far have focused almost exclusively on avoiding errors of commission, with the important caveat that it’s really our automated filtering process, not GDELT, that commits most of these errors. The basic problem for us is that GDELT, or really the CAMEO scheme on which it’s based, wasn’t designed to spot atrocities per se. As a result, most of what we filter out in our human-coding second stage aren’t things that were miscoded by GDELT. Instead, they’re things that were properly coded by GDELT as various forms of violent action but upon closer inspection don’t appear to involve the additional features of atrocities as we define them.

Of course, that still leaves us with this colleague’s central concern about errors of omission, and on that he’s absolutely right. I have experimented with different actor and event-type criteria to make sure we’re not missing a lot of events of interest in GDELT, but I haven’t yet compared what we’re finding in GDELT to what related databases that use different sources are seeing. Once we accumulate a few months’ worth of data, I think this is something we’re really going to need to do.

Stay tuned for Take 3…

Using GDELT to Monitor Atrocities, Take 2

Last May, I wrote a post about my preliminary efforts to use a new data set called GDELT to monitor reporting on atrocities around the world in near-real time. Those efforts represent one part of the work I’m doing on a public early-warning system for the U.S. Holocaust Memorial Museum’s Center for the Prevention of Genocide, and they have continued in fits and starts over the ensuing eight months. With help from Dartmouth’s Dickey Center, Palantir, and the GDELT crew, we’ve made a lot of progress. I thought I’d post an update now because I’m excited about the headway we’ve made; I think others might benefit from seeing what we’re doing; and I hope this transparency can help us figure out how to do this task even better.

So, let’s cut to the chase: Here is a screenshot of an interactive map locating the nine events captured in GDELT in the first week of January 2014 that looked like atrocities to us and occurred in a place that the Google Maps API recognized when queried. (One event was left off the map because Google Maps didn’t recognize its reported location.) The size of the bubbles corresponds to the number of civilian deaths, which in this map range from one to 31. To really get a feel for what we’re trying to do, though, head over to the original visualization on CartoDB (here), where you can zoom in and out and click on the bubbles to see a hyperlink to the story from which each event was identified.

[Screenshot: interactive map of atrocities identified in GDELT, first week of January 2014]

Looks simple, right? Well, it turns out it isn’t, not by a long shot.

As this blog’s regular readers know, GDELT uses software to scour the web for new stories about political interactions all around the world and parses those stories to identify and record information about who did or said what to whom, when, and where. It currently covers the period from 1979 to the present and is updated every day, and each of those daily updates contains some 100,000-140,000 new records. Miraculously, and crucially for a non-profit pilot project like ours, GDELT is also available for free.

The nine events plotted in the map above were sifted from the tens of thousands of records GDELT dumped on us in the first week of 2014. Unfortunately, that data-reduction process is only partially automated.

The first step in that process is the quickest. As originally envisioned back in May, we are using an R script (here) to download GDELT’s daily update file and sift it for events that look, from the event type and actors involved, like they might involve what we consider to be an atrocity—that is, deliberate, deadly violence against one or more noncombatant civilians in the context of a wider political conflict.
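
To give a flavor of that first step, here is a much-condensed sketch. The URL pattern and column positions follow GDELT’s public documentation, but the event and actor codes shown are illustrative examples, not our actual filter set:

```r
# Download one day's GDELT update and keep records whose CAMEO event type
# suggests deadly violence against civilian-type targets
gdelt.url <- "http://data.gdeltproject.org/events/20140101.export.CSV.zip"
tmp <- tempfile(fileext = ".zip")
download.file(gdelt.url, tmp, quiet = TRUE)
daily <- read.delim(unz(tmp, "20140101.export.CSV"),
                    header = FALSE, quote = "", stringsAsFactors = FALSE)

evt.root <- daily[[29]]  # EventRootCode: 18 = assault, 20 = unconventional mass violence
tgt.type <- daily[[23]]  # Actor2Type1Code: CVL = civilian, REF = refugee
keep <- evt.root %in% c(18, 20) & tgt.type %in% c("CVL", "REF")
filtered <- daily[keep, ]  # this much smaller set goes on to human review
```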

Unfortunately, the stack of records that filtering script returns—something like 100-200 records per day—still includes a lot of stuff that doesn’t interest us. Some records are properly coded but involve actions that don’t meet our definition of an atrocity (e.g., clashes between rioters and police or rebels and troops); some involve atrocities but are duplicates of events we’ve already captured; and some are just miscoded (e.g., a mention of the film industry “shooting” movies that gets coded as soldiers shooting civilians).

After we saw how noisy our data set would be if we stopped screening there, we experimented with a monitoring system that would acknowledge GDELT’s imperfections and try to work with them. As Phil Schrodt recommended at the recent GDELT DC Hackathon, we looked to “embrace the suck.” Instead of trying to use GDELT to generate a reliable chronicle of atrocities around the world, we would watch for interesting and potentially relevant perturbations in the information stream, noise and all, and those perturbations would produce alerts that users of our system could choose to investigate further. Working with Palantir, we built a system that would estimate country-specific prior moving averages of daily event counts returned by our filtering script and would generate an alert whenever a country’s new daily count landed more than two standard deviations above or below that average.
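
The alert rule itself was simple. A sketch for a single country, where counts is a numeric vector of that country’s daily filtered-event counts in chronological order:

```r
# Flag today's count if it sits more than two standard deviations from the
# trailing moving average; the 30-day window is illustrative
alert <- function(counts, window = 30) {
  stopifnot(length(counts) > window)
  n <- length(counts)
  base <- counts[(n - window):(n - 1)]  # trailing window, excluding today
  abs(counts[n] - mean(base)) > 2 * sd(base)
}
```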

That system sounded great to most of the data pros in our figurative room, but it turned out to be a non-starter with some other constituencies of importance to us. The issue was credibility. Some of the events causing those perturbations in the GDELT stream were exactly what we were looking for, but others—a pod of beached whales in Brazil, or Congress killing a bill on healthcare reform—were laughably far from the mark. If our supposedly high-tech system confused beached whales and Congressional procedures for mass atrocities, we would risk undercutting the reputation for reliability and technical acumen that we are striving to achieve.

So, back to the drawing board we went. To separate the signal from the static and arrive at something more like the valid chronicle we’d originally envisioned, we decided to add a second, more laborious step to our data-reduction process. After our R script had done its work, we would review each of the remaining records by hand to decide whether it belonged in our data set and, when necessary, to correct any fields that appeared to have been miscoded. While we were at it, we would also record the number of deaths each event produced. We wrote a set of rules to guide those decisions; had two people (a Dartmouth undergraduate research assistant and me) apply those rules to the same sets of daily files; and compared notes and made fixes. After a few iterations of that process over a few months, we arrived at the codebook we’re using now (here).

This process radically reduces the amount of data involved. Each of those two steps drops us down multiple orders of magnitude: from 100,000-140,000 records in the daily updates, to about 150 in our auto-filtered set, to just one or two in our hand-filtered set. The figure below illustrates the extent of that reduction. In effect, we’re treating GDELT as a very powerful but error-prone search and coding tool, a source of raw ore that needs refining to become the thing we’re after. This isn’t the only way to use GDELT, of course, but for our monitoring task as presently conceived, it’s the one that we think will work best.

[Figure: the data reduction from 100,000+ records in each daily GDELT update to one or two hand-validated events]

Once that second data-reduction step is done, we still have a few tasks left to enable the kind of mapping and analysis we aim to do. We want to trim the data set to keep only the atrocities we’ve identified, and we need to consolidate the original and corrected fields in those remaining records and geolocate them. All of that work gets done with a second R script (here), which is applied to the spreadsheet the coder saves after completing her work. The much smaller file that script produces is then ready to upload to a repository where it can be combined with other days’ outputs to produce the global chronicle our monitoring project aims to produce.
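
The geolocation piece can be handled with the ggmap package’s interface to the Google Maps API. A sketch with hypothetical column names; the actual script may differ in detail:

```r
# Geocode each remaining event's reported location via the Google Maps API
library(ggmap)

events$loc.string <- paste(events$location, events$country, sep = ", ")
coords <- geocode(events$loc.string)    # returns a data frame of lon/lat
events <- cbind(events, coords)
events <- events[!is.na(events$lon), ]  # drop events Google Maps can't place
```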

From start to finish, each daily update now takes about 45 minutes, give or take 15. We’d like to shrink that further if we can but don’t see any real opportunities to do so at the moment. Perhaps more important, we still have to figure out the bureaucratic procedures that will allow us to squeeze daily updates from a “human in the loop” process in a world where there are weekends and holidays and people get sick and take vacations and sometimes even quit. Finally, we also have not yet built the dashboard that will display and summarize and provide access to these data on our program’s web site, which we expect to launch some time this spring.

We know that the data set this process produces will be incomplete. I am 100-percent certain that during the first week of January 2014, more than 10 events occurred around the world that met our definition of an atrocity. Unfortunately, we can only find things where GDELT looks, and even a scan of every news story produced every day everywhere in the world would fail to see the many atrocities that never make the news.

On the whole, though, I’m excited about the progress we’ve made. As soon as we can launch it, this monitoring process should help advocates and analysts more efficiently track atrocities globally in close to real time. As our data set grows, we also hope it will serve as the foundation for new research on forecasting, explaining, and preventing this kind of violence. Even with its evident shortcomings, we believe this data set will prove to be useful, and as GDELT’s reach continues to expand, so will ours.

PS For a coda discussing the great ideas people had in response to this post, go here.

[Erratum: The original version of this post said there were about 10,000 records in each daily update from GDELT. The actual figure is 100,000-140,000. The error has been corrected and the illustration of data reduction updated accordingly.]

Why More Mass Killings in 2013, and What It Portends for This Year

In a recent post, I noted that 2013 had distinguished itself in a dismal way, by producing more new episodes of mass killing than any other year since the early 1990s. Now let’s talk about why.

Each of these mass killings surely involves some unique and specific local processes, and people who study those societies in depth can describe them far better than I can. As someone who believes local politics is always embedded in a global system, however, I don’t think we can fully understand these situations by considering only those idiosyncratic features. Sometimes we see “clusters” that aren’t really there, but the evidence that we live in a global system leads me to think that isn’t what’s happening here.

To fully understand why a spate of mass killings is happening now, I think it helps to recognize that this cluster is occurring alongside—or, in some cases, in concert with—a spate of state collapses and during a period of unusually high social unrest. Systemic thinking leads me to believe that these processes are interrelated in explicable ways.

Just as there are boom and bust cycles within economies, there seem to be cycles of political (dis)order in the global political economy, too. Economic crunches help spur popular unrest. Economic crunches are often regional or global in nature, and unrest can inspire imitation. These reverberating challenges can shove open doors to institutional change, but they also tend to inspire harsh responses from incumbents intent on preserving the status quo ante. The ensuing clashes present exactly the conditions that are ripest for mass killing. Foreign governments react to these clashes in various ways, sometimes to try to quell the conflict and sometimes to back a favored side. These reactions often beget further reactions, however, and efforts to manufacture a resolution can end up catalyzing wider disorder instead.

In hindsight, I don’t think it’s an accident that the last phase of comparable disorder—the early 1990s—produced two iconic yet seemingly contradictory pieces of writing on political order: Francis Fukuyama’s The End of History and the Last Man, and Robert Kaplan’s “The Coming Anarchy.” A similar dynamic seems to be happening now. Periods of heightened disorder bring heightened uncertainty, with many possibilities both good and bad. All good things do not necessarily arrive together, and the disruptions that are producing some encouraging changes in political institutions at the national and global levels also open the door to horrifying violence.

Of course, in political terms, calendar years are an entirely arbitrary delineation of time. The mass killings I called out in that earlier post weren’t all new in 2013, and the processes generating them don’t reset with the arrival of a new year. In light of the intensification and spread of the now-regional war in Syria; escalating civil wars in Pakistan, Iraq, and Afghanistan; China’s increasingly precarious condition; and the persistence of economic malaise in Europe, among other things, I think there’s a good chance that we still haven’t reached the peak of the current phase of global disorder. And, on mass killing in particular, I suspect that the persistence of this phase will continue to produce new episodes at a faster rate than we saw in the previous 20 years.

That’s the bad news. The slightly better news is that, while we (humanity) still aren’t nearly as effective at preventing mass killings as we’d like to be, there are signs that we’re getting better at it. In a recent post on United to End Genocide’s blog, Daniel Sullivan noted “five successes in genocide prevention in 2013,” and I think his list is a good one. Political scientist Bear Braumoeller encourages us to think of the structure of the international system as distributions of features deemed important by the major actors in it. Refracting Sullivan’s post through that lens, we can see how changes in the global distribution of political regime types, of formal and informal interdependencies among states, of ideas about atrocities prevention, and of organizations devoted to advocating for that cause seem to be enabling changes in responses to these episodes that are helping to stop or slow some of them sooner, making them somewhat less deadly on the whole.

The Central African Republic is a telling example. Attacks and clashes there have probably killed thousands over the past year, and even with deeper foreign intervention, the fighting hasn’t yet stopped. Still, in light of the reports we were receiving from people on the scene in early December (see here and here, for example), it’s easy to imagine this situation having spiraled much further downward already, had French forces and additional international assistance not arrived when they did. A similar process may be occurring now in South Sudan. Both cases already involve terrible violence on a large scale, but we should also acknowledge that both could have become much worse—and very likely will, if the braking efforts underway are not sustained or even intensified.
