Occupy Central and the Rising Risk of New Mass Atrocities in China

This is a cross-post from the blog of the Early Warning Project, which I currently direct. The Early Warning Project concentrates on risks of mass atrocities, but this post also draws on my longstanding interest in democratization and social unrest, so I thought I would share it here as well.

Activists have massed by the thousands in central Hong Kong for the past several days in defiance of repeated attempts to disperse them and menacing words from Beijing. This demonstration and the wider Occupy Central movement from which it springs pose one of the sharpest public challenges to Communist Party authority since the Tiananmen Square uprising 25 years ago. In so doing, they clearly raise the risk of new mass atrocities in China.

Photo credit: AP via BBC News

The demonstrations underway now are really just the latest surge in a wave of activism that began in Hong Kong earlier this year. Under the “one country, two systems” framework to which China committed when it regained sovereignty over the then–UK colony in 1997, Hong Kong is supposed to enjoy a great deal of autonomy over local governance. This summer, however, Beijing issued a white paper affirming the central government’s “comprehensive jurisdiction” over Hong Kong, and it blocked plans for open nominations in local elections due in 2017. Those actions spurred (and were spurred by) an unofficial referendum and a mass pro-democracy rally that eventually ebbed from the streets but left behind a strengthened civic movement.

The ongoing demonstrations began with a student boycott of classes a week ago, but they escalated sharply on Friday, when activists began occupying key public spaces in central Hong Kong. Police have made several forceful attempts to disperse or remove the protesters, and official channels have said publicly that Beijing “firmly opposes all illegal activities that could undermine rule of law and jeopardise ‘social tranquility’” in Hong Kong. So far, however, the occupations have proved resilient to those thrusts and threats.

Many observers are now openly wondering how this confrontation will end. For those sympathetic to the protesters, the fear is that Beijing will respond with lethal force, as it did at Tiananmen Square in 1989.

As it happens, the Early Warning Project’s statistical risk assessments do not identify China as a country at relatively high risk of state-led mass killing this year. Partly because of that, we do not currently have a question open on our opinion pool that covers this situation. (Our lone China question focuses on the risk of state-led mass atrocities targeting Uyghurs.)

If we did have a relevant question open on our opinion pool, however, I would be raising my estimate of the risk of a state-led mass killing in response to these developments. I still don’t expect that one will occur, but not because I anticipate that Beijing will concede to the protesters’ demands. Rather, I expect violent repression, but I also doubt that it will cross the 1,000-death threshold we and others use to distinguish episodes of mass killing from smaller-scale and more routine atrocities.

State-led mass killings as we define them usually occur when incumbent rulers perceive potentially existential threats to their authority. Following leading theories on the subject, our statistical analysis concentrates on armed insurgencies and coups as the forms those threats typically take. Authoritarian governments often suppress swelling demonstrations with violence as well, but those crackdowns rarely kill as many as 1,000 nonviolent protesters, who usually disperse long before that threshold is reached. Even the Tiananmen Square massacre probably fell short of this threshold, killing “only” hundreds of activists before achieving the regime’s goal of dispersing the occupation and setting an example that would frighten future dissenters.

Instead, violent state crackdowns usually push countries onto one of three other pathways before they produce more than 1,000 fatalities: 1) they succeed at breaking the uprising and essentially restore the status quo ante (e.g., China in 1989, Uzbekistan in 2005, Burma in 2007, and Thailand in 2010); 2) they suppress the nonviolent challenge but, in so doing, help to spawn a violent rebellion that may or may not be met with a mass killing of its own (e.g., Syria since 2011); or 3) they catalyze splits in state security forces or civilian rulers that lead to negotiations, reforms, or regime collapse (e.g., Egypt and Tunisia in 2011). In short, nonviolent uprisings usually lose, transform, or win before the attempts to suppress them amount to what we would call a state-led mass killing.

In Hong Kong right now, the first path—successful repression—appears to be the most likely. Chinese Communist Party leaders have spoken openly in recent years about trying to learn from the mistakes that led to the collapse of the Soviet Union, and the mixed signals that were sent to early risers in the USSR—some protests were repressed, but others were allowed to run their course or met with modest concessions—probably rank high on their list of things to avoid. Those Party leaders also know that activists and separatists elsewhere in China are closely watching events in Hong Kong and would probably take encouragement from anything short of a total defeat for Occupy Central. These considerations generate strong incentives to try to quash the current challenge.

In contrast, the second of those three trajectories—a transformation to violent insurgency in response to state repression—seems highly unlikely. Protesters have shown a strong commitment to nonviolence so far and have strategic as well as ideological reasons to continue to do so; after all, the People’s Liberation Army is about as formidable a foe as they come. Brutal state repression might radicalize some survivors and inspire other onlookers, but Hong Kong is a wealthy, urban enclave with minimal access to arms, so a turn toward violent rebellion would face tall structural obstacles.

The third of those trajectories also seems unlikely, albeit somewhat less so than the second. The Communist Party currently faces several profound challenges: a slowing rate of economic growth and widespread concern about a looming financial crisis; an escalating insurgency in Xinjiang; and an epidemic of local protests over pollution, to name just a few. Meanwhile, Xi Jinping’s anti-corruption campaign is creating new fissures within the country’s ruling class, and rumors of dissent within the military have swirled occasionally in the past two years as well. As I discussed in a recent blog post, consolidated single-party regimes like China’s usually weather these kinds of challenges. When they do break down, however, it almost always happens in times like these, when worried insiders start to fight among themselves and form alliances with emboldened popular challengers.

Put those considerations together, and it seems that Beijing is most likely to respond to Occupy Central with a crackdown that could be lethal but probably will not cross the 1,000-death threshold we use to distinguish episodes of mass killing from more routine political violence. It seems less likely but still possible that the prospect or occurrence of such a crackdown will catalyze the kinds of elite splits that could finally produce significant political reform or sustained instability in China. Under none of these circumstances would I expect the challenge in Hong Kong to evolve into an armed rebellion that might produce a new wave of atrocities of its own.

No matter what the immediate outcome, though, it seems increasingly clear that China has entered a period of “thickened history,” as Mark Beissinger calls it, in which national politics will remain more eventful and less certain for some time to come.

The Rwanda Enigma

For analysts and advocates trying to assess risks of future mass atrocities in hopes of preventing them, Rwanda presents an unusual puzzle. Most of the time, specialists in this field readily agree on which countries are especially susceptible to genocide or mass killing, either because those countries are already experiencing large-scale civil conflict or because they are widely considered susceptible to it. Meanwhile, countries that sustain long episodes of peace and steadily grow their economies are generally presumed to have reduced their risk and eventually to have escaped this trap for good.

Contemporary Rwanda is puzzling because it provokes a polarized reaction. Many observers laud Rwanda as one of Africa’s greatest developmental successes, but others warn that it remains dangerously prone to mass atrocities. In a recent essay for African Arguments on how the Rwandan genocide changed the world, Omar McDoom nicely encapsulates this unusual duality:

What has changed inside Rwanda itself since the genocide? The country has enjoyed a remarkable period of social stability. There has not been a serious incident of ethnic violence in Rwanda for nearly two decades. Donors have praised the country’s astonishing development.  Economic growth has averaged over 6% per year, poverty and inequality have declined, child and maternal mortality have improved, and primary education is now universal and free. Rwanda has shown, in defiance of expectations, that an African state can deliver security, public services, and rising prosperity.

Yet, politically, there is some troubling continuity with pre-genocide Rwanda. Power remains concentrated in the hands of a small, powerful ethnic elite led by a charismatic individual with authoritarian tendencies. In form, current president Paul Kagame and his ruling party, the RPF, the heroes who ended the genocide, appear to exercise power in a manner similar to former president Juvenal Habyarimana and his ruling MRND party, the actors closely-tied to those who planned the slaughter. The genocide is testament to what unconstrained power over Rwanda’s unusually efficient state machinery can enable.

That duality also emerges from a comparison of two recent quantitative rankings. On the one hand, the World Bank now ranks Rwanda 32nd on the latest edition of its “ease of doing business” index—not 32nd in Africa, but 32nd of 189 countries worldwide. On the other hand, statistical assessments of the risk of an onset of state-led mass killing identify Rwanda as one of the 25 countries worldwide currently most vulnerable to this kind of catastrophe.

How can both of these things be true? To answer that question, we need to have a clearer sense of where that statistical risk assessment comes from. The number that ranks Rwanda among the 25 countries most susceptible to state-led mass killing is actually an average of forecasts from three models representing a few different ideas about the origins of mass atrocities, all applied to publicly available data from widely used sources.

  • Drawing on work by Barbara Harff and the Political Instability Task Force, the first model emphasizes features of countries’ national politics that hint at a predilection to commit genocide or “politicide,” especially in the context of political instability. Key risk factors in Harff’s model include authoritarian rule, the political salience of elite ethnicity, evidence of an exclusionary elite ideology, and international isolation as measured by trade openness.
  • The second model takes a more instrumental view of mass killing. It uses statistical forecasts of future coup attempts and new civil wars as proxy measures of things that could either spur incumbent rulers to lash out against threats to their power or usher in an insecure new regime that might do the same.
  • The third model is really not a model but a machine-learning process called Random Forests applied to the risk factors identified by the other two. The resulting algorithm is an amalgamation of theory and induction that takes experts’ beliefs about the origins of mass killing as its jumping-off point but also leaves more room for inductive discovery of contingent effects.
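The inductive step behind that third approach can be illustrated with a toy sketch: fit a Random Forest to risk-factor inputs and read off predicted onset probabilities. Everything below (the random data, the six-feature layout, the roughly 5 percent base rate) is invented for illustration; the real model is trained on historical country-year observations of the risk factors named above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical training data: 500 country-years, six risk factors,
# with rare positive outcomes (onsets) at roughly a 5% base rate.
X = rng.random((500, 6))
y = (rng.random(500) < 0.05).astype(int)

# Fit an ensemble of decision trees; each tree sees a bootstrap sample
# and random feature subsets, which is where the induction comes in.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Predicted probability of onset for one hypothetical country-year.
risk = forest.predict_proba(X[:1])[:, 1]
print(risk[0])
```

The point of the exercise is not the numbers but the division of labor: the theory-driven models supply the inputs, and the algorithm is left to discover contingent combinations of them.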

All of these models are estimated from historical data that compares cases where state-led mass killings occurred to ones where they didn’t. In essence, we look to the past to identify patterns that will help us spot cases at high risk of mass killing now and in the future. To get our single-best risk assessment—the number that puts Rwanda in the top (or bottom) 25 worldwide—we simply average the forecasts from these three models. We prefer the average to a single model’s output because we know from work in many fields—including meteorology and elections forecasting—that this “ensemble” approach generally produces more accurate assessments than we could expect to get from any one model alone. By combining forecasts, we learn from all three perspectives and hedge against the biases of any one of them.
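Mechanically, the averaging step is simple: each model emits a probability for each country, and the ensemble forecast is their unweighted mean. A minimal sketch, with made-up numbers standing in for the three models' outputs:

```python
# Hypothetical per-model probabilities of mass-killing onset for one
# country. The real values come from the PITF/Harff model, the "elite
# threat" model, and the Random Forest; these are invented for illustration.
forecasts = {
    "pitf_harff": 0.033,
    "elite_threat": 0.021,
    "random_forest": 0.012,
}

# Unweighted ensemble average: the single-best risk assessment we report.
ensemble = sum(forecasts.values()) / len(forecasts)
print(round(ensemble, 3))  # → 0.022
```

Countries are then ranked on this averaged value, which is how a case like Rwanda can land in the global top 25 even when no single model puts it at the very top.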

Rwanda lands in the top 25 worldwide because all three models identify it as a relatively high-risk case. It ranks 15th on the PITF/Harff model, 28th on the “elite threat” model, and 30th on the Random Forest. The PITF/Harff model sees a relatively low risk in Rwanda of the kinds of political instability that typically trigger onsets of genocide or politicide, but it also pegs Rwanda as the kind of regime most likely to resort to mass atrocities if instability were to occur—namely, an autocracy in which elites’ ethnicity is politically salient in a country with a recent history of genocide. Rwanda also scores fairly high on the “elite threat” model because, according to our models of these things, it is at relatively high risk of a new insurgency and moderate risk of a coup attempt. Finally, the Random Forest sees a very low probability of mass killing onset in Rwanda but still pegs it as a riskier case than most.

Our identification of Rwanda as a relatively high-risk case is echoed by some, but not all, of the other occasional global assessments of countries’ susceptibility to mass atrocities. In her own applications of her genocide/politicide model for the task of early warning, Barbara Harff pegged Rwanda as one of the world’s riskiest cases in 2011 but not in 2013. Similarly, the last update of Genocide Watch’s Countries at Risk Report, in 2012, listed Rwanda as one of more than a dozen countries at stage five of seven on the path to genocide, putting it among the 35 countries worldwide at greatest risk. By contrast, the Global Centre for the Responsibility to Protect has not identified Rwanda as a situation of concern in any of its R2P Monitor reports to date, and the Sentinel Project for Genocide Prevention does not list Rwanda among its situations of concern, either. Meanwhile, recent reporting on Rwanda from Human Rights Watch has focused mostly on the pursuit of justice for the 1994 genocide and other kinds of human-rights violations in contemporary Rwanda.

To see what our own pool of experts makes of our statistical risk assessment and to track changes in their views over time, we plan to add a question to our “wisdom of (expert) crowds” forecasting system asking about the prospect of a new state-led mass killing in Rwanda before 2015. If one does not happen, as we hope and expect will be the case, we plan to re-launch the question at the start of next year and will continue to do so as long as our statistical models keep identifying it as a case of concern.

In the meantime, I thought it would be useful to ask a few country experts what they make of this assessment and how a return to mass killing in Rwanda might come about. Some were reluctant to speak on the record, and understandably so. The present government of Rwanda has a history of intimidating individuals it perceives as its critics. As Michela Wrong describes in a recent piece for Foreign Policy,

A U.S. State Department spokesperson said in mid-January, “We are troubled by the succession of what appear to be politically motivated murders of prominent Rwandan exiles. President Kagame’s recent statements about, quote, ‘consequences’ for those who betray Rwanda are of deep concern to us.”

It is a pattern that suggests the Rwandan government may have come to see the violent silencing of critics—irrespective of geographical location and host country—as a beleaguered country’s prerogative.

Despite these constraints, the impression I get from talking to some experts and reading the work of others is that our risk assessment strikes nearly all of them as plausible. None said that he or she expects an episode of state-led mass killing to begin soon in Rwanda. Consistent with the thinking behind our statistical models, though, many seem to believe that another mass killing could occur in Rwanda, and if one did, it would almost certainly come in reaction to some other rupture in that country’s political stability.

Filip Reyntjens, a professor at the University of Antwerpen who wrote a book on Rwandan politics since the 1994 genocide, was both the most forthright and the most pessimistic in his assessment. Via email, he described Rwanda as

A volcano waiting to erupt. Nearly all field research during the last 15 years points at pervasive structural violence that may, as we know, become physical, acute violence following a trigger. I don’t know what that trigger will be, but I think a palace revolution or a coup d’etat is the most likely scenario. That may create a situation difficult to control.

In a recent essay for Juncture that was adapted for the Huffington Post (here), Phil Clark sounds more optimistic than Reyntjens, but he is not entirely sanguine, either. Clark sees the structure and culture of the country’s ruling party, the Rwandan Patriotic Front (RPF), as the seminal feature of Rwandan politics since the genocide and describes it as a double-edged sword. On the one hand, the RPF’s cohesiveness and dedication to purpose have enabled it, with help from an international community with a guilty conscience, to make “enormous” developmental gains. On the other hand,

The RPF’s desire for internal cohesion has made it suspicious of critical voices within and outside of the party—a feature compounded by Rwanda’s fraught experience of multi-party democracy in the early 1990s, which saw the rise of ethnically driven extremist parties and helped to create an environment conducive to genocide. The RPF’s singular focus on rebuilding the nation and facilitating the return of refugees means it has often viewed dissent as an unaffordable distraction. The disastrous dalliance with multipartyism before the genocide has only added to the deep suspicion of policy based on the open contestation of ideas.

Looking ahead, Clark wonders what happens when that intolerance for dissent bumps up against popular frustrations, as it probably will at some point:

For the moment, there are few signs of large-scale popular discontent with the closed political space. However, any substantial decline in socio-economic conditions in the countryside will challenge this. The RPF’s gamble appears to be that the population will tolerate a lack of national political contestation provided domestic stability and basic living standards are maintained. For now, the RPF seems to have rightly judged the popular mood but that situation may not hold.

Journalist Kris Berwouts portrays similarly ambiguous terrain in a recent piece for the Dutch magazine Mo that also appeared on the blog African Arguments (here). Berwouts quotes David Himbara, a former Rwandan regime insider who left the country in 2010 and has vocally criticized the Kagame government ever since, as telling him that “all society has vanished from Rwanda, mistrust is complete. It has turned Rwanda into a time bomb.” But Berwouts juxtaposes that dire assessment with the cautiously optimistic view of Belgian journalist Marc Hoogsteyns, who has worked in the region for years and has family ties by marriage to its Tutsi community. According to Hoogsteyns,

Rwanda is a beautiful country with many strengths and opportunities, but at the same time it is some kind of African version of Brave New World. People are afraid to talk. But they live more comfortably and safely than ever before, they enjoy high quality education and health care. They are very happy with that. The Tutsi community stands almost entirely behind Kagame and also most Hutu can live with it. They obviously don’t like the fact that they do not count on the political scene, but they can do what they want in all other spheres of live. They can study and do business etcetera. They can deal with the level of repression, because they know that countries such as Burundi, Congo or Kenya are not the slightest bit more democratic. Honestly, if we would have known twenty years ago, just after the genocide, that Rwanda would achieve this in two decades, we would have signed for it immediately.

As people of a certain age in places like Sarajevo or Bamako might testify, though, stability is a funny thing. It’s there until it isn’t, and when it goes, it sometimes goes quickly. In this sense, the political crises that sometimes produce mass killings are more like earthquakes than elections. We can spot the vulnerable structures fairly accurately, but we’re still not very good at anticipating the timing and dynamics of ruptures in them.

In the spirit of that last point, it’s important to acknowledge that the statistical assessment of Rwanda’s risk of mass killing is a blunt piece of information. Although it does specifically indicate a susceptibility to atrocities perpetrated by state security forces or groups acting at their behest, it does not necessarily implicate the RPF as the likely perpetrators. The qualitative assessments discussed above suggest that some experts find that scenario plausible, but it isn’t the only one consistent with our statistical finding. A new regime brought to power by coup or revolution could also become the agent of a new wave of mass atrocities in Rwanda, and the statistical forecast would be just as accurate.

Egypt’s recent past offers a case in point. Our statistical assessments of susceptibility to state-led mass killing in early 2013 identified Egypt as a relatively high-risk case, like Rwanda now. At the time, Mohammed Morsi was president, and one plausible interpretation of that risk assessment might have centered on the threat the Muslim Brotherhood’s supporters posed to Egypt’s Coptic Christians. Fast forward to July 2013, and the mass killing we ended up seeing in Egypt came at the hands of an army and police who snatched power away from Morsi and the Brotherhood and then proceeded to kill hundreds of their unarmed sympathizers. That outcome doesn’t imply that Coptic Christians weren’t at grave risk before the coup, but it should remind us to consider a variety of ways these systemic risks might become manifest.

Still, after conversations with a convenience sample of regional experts, I am left with the impression that the risk our statistical models identify of a new state-led mass killing in Rwanda is real, and that it is possible to imagine the ruling RPF as the agents of such violence.

No one seems to expect the regime to engage in mass violence without provocation, but the possibility of a new Hutu insurgency, and the state’s likely reaction to it, emerged from those conversations as perhaps the most likely scenario. According to some of the experts with whom I spoke, many Rwandan Hutus are growing increasingly frustrated with the RPF regime, and some radical elements of the Hutu diaspora appear to be looking for ways to take up that mantle. The presence of an insurgency is the single most powerful predictor of state-led mass killing, and it does not seem far-fetched to imagine the RPF regime using “scorched earth” tactics in response to the threat or occurrence of attacks on its soldiers and Tutsi citizens. After all, this is the same regime whose soldiers pursued Hutu refugees into Zaire in the mid-1990s and, according to a 2010 U.N. report, participated in the killings of tens of thousands of civilians in war crimes that were arguably genocidal.

Last but not least, we can observe that Rwanda has suffered episodes of mass killing roughly once per generation since independence—in the early 1960s, in 1974, and again in the early 1990s, culminating in the genocide of 1994 and the reprisal killings that followed. History certainly isn’t destiny, but our statistical models confirm that in the case of mass atrocities, it often rhymes.

It saddens me to write this piece about a country that just marked the twentieth anniversary of one of the most lethal genocides since the Holocaust, but the point of our statistical modeling is to see what the data say that our mental models and emotional assessments might overlook. A reprisal of mass killing in Rwanda would be horribly tragic. As Free Africa Foundation president George Ayittey wrote in a recent letter to the Wall Street Journal, however, “The real tragedy of Rwanda is that Mr. Kagame is so consumed by the 1994 genocide that, in his attempt to prevent another one, he is creating the very conditions that led to it.”

Watch Experts’ Beliefs Evolve Over Time

On 15 December 2013, “something” happened in South Sudan that quickly began to spiral into a wider conflict. Prior research tells us that mass killings often occur on the heels of coup attempts and during civil wars, and at the time South Sudan ranked among the world’s countries at greatest risk of state-led mass killing.

Motivated by these two facts, I promptly added a question about South Sudan to the opinion pool we’re running as part of a new atrocities early-warning system for the U.S. Holocaust Memorial Museum’s Center for the Prevention of Genocide (see this recent post for more on that). As it happened, we already had one question running about the possibility of a state-led mass killing in South Sudan targeting the Murle, but the spiraling conflict clearly implied a host of other risks. Posted on 18 December 2013, the new question asked, “Before 1 January 2015, will an episode of mass killing occur in South Sudan?”

The criteria we gave our forecasters to understand what we mean by “mass killing” and how we would decide if one has happened appear under the Background Information header at the bottom of this post. Shown below is an animated sequence of kernel density plots of each day’s forecasts from all participants who’d chosen to answer this question. A kernel density plot is like a histogram, but with some nonparametric estimation thrown in to try to get at the distribution of a variable’s “true” values from the sample of observations we’ve got. If that sounds like gibberish to you, just think of the peaks in the plots as clumps of experts who share similar beliefs about the likelihood of mass killing in South Sudan. The taller the peak, the bigger the clump. The farther right the peak, the more likely that clump thinks a mass killing is.
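For readers curious about the mechanics, a density like the ones in these plots can be computed from a single day's forecasts with standard tools. Here is a minimal sketch using SciPy's Gaussian kernel density estimator; the forecast values are invented for illustration:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical forecasts (probabilities of mass killing) submitted by a
# pool of experts on one day; the real data come from the opinion pool.
day_forecasts = np.array([0.55, 0.60, 0.70, 0.75, 0.80, 0.80, 0.85, 0.90])

# Fit a Gaussian kernel density estimate to the sample...
kde = gaussian_kde(day_forecasts)

# ...and evaluate it on a grid over the probability scale. Peaks in
# `density` mark clumps of experts who share similar beliefs.
grid = np.linspace(0, 1, 101)
density = kde(grid)

print(grid[np.argmax(density)])  # location of the tallest peak
```

Repeating this for each day's forecasts and stringing the resulting curves together is all the animation below does.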


I see a couple of interesting patterns in those plots. The first is the rapid rightward shift in the distribution’s center of gravity. As the fighting escalated and reports of atrocities began to trickle in (see here for one much-discussed article from the time), many of our forecasters quickly became convinced that a mass killing would occur in South Sudan in the coming year, if one wasn’t occurring already. On 23 December—the date that aforementioned article appeared—the average forecast jumped to approximately 80 percent, and it hasn’t fallen below that level since.

The second pattern that catches my eye is the appearance in January of a long, thin tail in the distribution that reaches into the lower ranges. That shift in the shape of the distribution coincides with stepped-up efforts by U.N. peacekeepers to stem the fighting and the start of direct talks between the warring parties. I can’t say for sure what motivated that shift, but it looks like our forecasters split in their response to those developments. While most remained convinced that a mass killing would occur or had already, a few forecasters were apparently more optimistic about the ability of those peacekeepers or talks or both to avert a full-blown mass killing. A few weeks later, it’s still not clear which view is correct, although a forthcoming report from the U.N. Mission in South Sudan may soon shed more light on this question.

I think this set of plots is interesting on its face for what it tells us about the urgent risk of mass atrocities in South Sudan. At the same time, I also hope this exercise demonstrates the potential to extract useful information from an opinion pool beyond a point-estimate forecast. We know from prior and ongoing research that those point estimates can be quite informative in their own right. Still, by looking at the distribution of participants’ forecasts on a particular question, we can glean something about the degree of uncertainty around an event of interest or concern. By looking for changes in that distribution over time, we can also get a more complete picture of how the group’s beliefs evolve in response to new information than a simple line plot of the average forecast could ever tell us. Look for more of this work as our early-warning system comes online, hopefully in the next few months.

UPDATE (7 Feb): At the urging of Trey Causey, I tried making another version of this animation in which the area under the density plot is filled in. I also decided to add a vertical line to show each day’s average forecast, which is what we currently report as the single-best forecast at any given time. Here’s what that looks like, using data from a question on the risk of a mass killing occurring in the Central African Republic before 2015. We closed this question on 19 December 2013, when it became clear through reporting by Human Rights Watch and others that an episode of mass killing had occurred.


Background Information

We will consider a mass killing to have occurred when the deliberate actions of state security forces or other armed groups result in the deaths of at least 1,000 noncombatant civilians over a period of one year or less.

  • A noncombatant civilian is any person who is not a current member of a formal or irregular military organization and who does not apparently pose an immediate threat to the life, physical safety, or property of other people.
  • The reference to deliberate actions distinguishes mass killing from deaths caused by natural disasters, infectious diseases, the accidental killing of civilians during war, or the unanticipated consequences of other government policies. Fatalities should be considered intentional if they result from actions designed to compel or coerce civilian populations to change their behavior against their will, as long as the perpetrators could have reasonably expected that these actions would result in widespread death among the affected populations. Note that this definition also covers deaths caused by other state actions, if, in our judgment, perpetrators enacted policies/actions designed to coerce civilian populations and could have expected that these policies/actions would lead to large numbers of civilian fatalities. Examples of such actions include, but are not limited to: mass starvation or disease-related deaths resulting from the intentional confiscation or destruction of food, medicines, or other healthcare supplies; and deaths occurring during forced relocation or forced labor.
  • To distinguish mass killing from large numbers of unrelated civilian fatalities, the victims of mass killing must appear to be perceived by the perpetrators as belonging to a discrete group. That group may be defined communally (e.g., ethnic or religious), politically (e.g., partisan or ideological), socio-economically (e.g., class or professional), or geographically (e.g., residents of specific villages or regions). In this way, apparently unrelated executions by police or other state agents would not qualify as mass killing, but capital punishment directed against members of a specific political or communal group would.

The determination of whether or not a mass killing has occurred will be made by the administrators of this system using publicly available secondary sources and in consultation with subject-matter experts. Relevant evidence will be summarized in a blog post published when the determination is announced, and any dissenting views will be discussed as well.

A New Statistical Approach to Assessing Risks of State-Led Mass Killing

Which countries around the world are currently at greatest risk of an onset of state-led mass killing? At the start of the year, I posted results from a wiki survey that asked this question. Now, here in heat-map form are the latest results from a rejiggered statistical process with the same target. You can find a dot plot of these data at the bottom of the post, and the data and code used to generate them are on GitHub.

Estimated Risk of New Episode of State-Led Mass Killing

These assessments represent the unweighted average of probabilistic forecasts from three separate models trained on country-year data covering the period 1960-2011. In all three models, the outcome of interest is the onset of an episode of state-led mass killing, defined as any episode in which the deliberate actions of state agents or other organizations kill at least 1,000 noncombatant civilians from a discrete group. The three models are:

  • PITF/Harff. A logistic regression model approximating the structural model of genocide/politicide risk developed by Barbara Harff for the Political Instability Task Force (PITF). In its published form, the Harff model only applies to countries already experiencing civil war or adverse regime change and produces a single estimate of the risk of a genocide or politicide occurring at some time during that crisis. To build a version of the model that was more dynamic, I constructed an approximation of the PITF’s global model for forecasting political instability and use the natural log of the predicted probabilities it produces as an additional input to the Harff model. This approach mimics the one used by Harff and Ted Gurr in their ongoing application of the genocide/politicide model for risk assessment (see here).
  • Elite Threat. A logistic regression model that uses the natural log of predicted probabilities from two other logistic regression models—one of civil-war onset, the other of coup attempts—as its only inputs. This model is meant to represent the argument put forth by Matt Krain, Ben Valentino, and others that states usually engage in mass killing in response to threats to ruling elites’ hold on power.
  • Random Forest. A machine-learning technique (see here) applied to all of the variables used in the two previous models, plus a few others of possible relevance, using the 'randomForest' package in R. A couple of parameters were tuned on the basis of a gridded comparison of forecast accuracy in 10-fold cross-validation.
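The two-stage structure of the Elite Threat model is worth seeing in code. The author's actual models are estimated in R on country-year panel data; the sketch below is a hypothetical Python/scikit-learn version on simulated data, with all variable names and outcome processes invented for illustration:

```python
# Hypothetical sketch of the Elite Threat stacking idea: two first-stage
# logistic regressions (civil-war onset, coup attempts) whose logged
# predicted probabilities are the ONLY inputs to the second-stage model.
# Data are simulated; the author's real implementation is in R.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 4))  # stand-in for country-year risk factors
inv_logit = lambda z: 1 / (1 + np.exp(-z))

# Simulated rare outcomes, each a noisy function of the risk factors.
y_civil_war = rng.random(3000) < inv_logit(X[:, 0] - 2)
y_coup = rng.random(3000) < inv_logit(X[:, 1] - 2)
y_mass_killing = rng.random(3000) < inv_logit(X[:, 0] + X[:, 1] - 4)

# First stage: separate models of civil-war onset and coup attempts.
p_war = LogisticRegression().fit(X, y_civil_war).predict_proba(X)[:, 1]
p_coup = LogisticRegression().fit(X, y_coup).predict_proba(X)[:, 1]

# Second stage: natural log of those predicted probabilities as inputs.
stage2_inputs = np.column_stack([np.log(p_war), np.log(p_coup)])
elite_threat = LogisticRegression().fit(stage2_inputs, y_mass_killing)
forecasts = elite_threat.predict_proba(stage2_inputs)[:, 1]
```

The same pattern, with the logged output of an instability model fed into the Harff specification, is what makes the PITF/Harff model dynamic.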

The Random Forest proved to be the most accurate of the three models in stratified 10-fold cross-validation. The chart below is a kernel density plot of the areas under the ROC curve for the out-of-sample estimates from that cross-validation drill. As the chart shows, the average AUC for the Random Forest was in the low 0.80s, compared with the high 0.70s for the PITF/Harff and Elite Threat models. As expected, the average of the forecasts from all three performed even better than the best single model, albeit not by much. These out-of-sample accuracy rates aren't mind-blowing, but they aren't bad either, and they are as good as or better than many of the ones I've seen from similar efforts to anticipate the onset of rare political crises in countries worldwide.


Distribution of Out-of-Sample AUC Scores by Model in 10-Fold Cross-Validation
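The accuracy drill behind that chart can be sketched as follows. The original work was done in R; this is an illustrative Python/scikit-learn version on simulated data with a rare positive class, not a reproduction of the actual results:

```python
# Illustrative sketch of the drill: stratified 10-fold cross-validation
# with out-of-sample AUC scores for a Random Forest. Data are simulated
# with ~3% positives to mimic a rare event; the real code is in R.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=2000, weights=[0.97], random_state=0)

aucs = []
for train_idx, test_idx in StratifiedKFold(n_splits=10, shuffle=True,
                                           random_state=1).split(X, y):
    rf = RandomForestClassifier(n_estimators=200, random_state=1)
    rf.fit(X[train_idx], y[train_idx])
    p = rf.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], p))

mean_auc = float(np.mean(aucs))
```

Stratification matters here: with so few positive cases, unstratified folds can easily end up with no events at all, making the AUC undefined.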

The decision to use an unweighted average for the combined forecast might seem simplistic, but it’s actually a principled choice in this instance. When examples of the event of interest are hard to come by and we have reason to believe that the process generating those events may be changing over time, sticking with an unweighted average is a reasonable hedge against risks of over-fitting the ensemble to the idiosyncrasies of the test set used to tune it. For a longer discussion of this point, see pp. 7-8 in the last paper I wrote on this work and the paper by Andreas Graefe referenced therein.
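In code, the hedge is exactly as simple as it sounds: average the three probability vectors with no tuned weights. The forecast values below are invented for illustration:

```python
import numpy as np

# Invented forecasts from the three component models for three countries.
pitf_harff    = np.array([0.020, 0.105, 0.004])
elite_threat  = np.array([0.035, 0.080, 0.006])
random_forest = np.array([0.050, 0.120, 0.002])

# Unweighted ensemble: with rare events and a possibly shifting data-
# generating process, there are no weights to over-fit to a test set.
ensemble = (pitf_harff + elite_threat + random_forest) / 3
```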

Any close readers of my previous work on this topic over the past couple of years (see here and here) will notice that one model has been dropped from the last version of this ensemble, namely, the one proposed by Michael Colaresi and Sabine Carey in their 2008 article, “To Kill or To Protect” (here). As I was reworking my scripts to make regular updating easier (more on that below), I paid closer attention than I had before to the fact that the Colaresi and Carey model requires a measure of the size of state security forces that is missing for many country-years. In previous iterations, I had worked around that problem by using a categorical version of this variable that treated missingness as a separate category, but this time I noticed that there were fewer than 20 mass-killing onsets in country-years for which I had a valid observation of security-force size. With so few examples, we’re not going to get reliable estimates of any pattern connecting the two. As it happened, this model—which, to be fair to its authors, was not designed to be used as a forecasting device—was also by far the least accurate of the lot in 10-fold cross-validation. Putting two and two together, I decided to consign this one to the scrap heap for now. I still believe that measures of military forces could help us assess risks of mass killing, but we’re going to need more and better data to incorporate that idea into our multimodel ensemble.

The bigger and in some ways more novel change from previous iterations of this work concerns the unorthodox approach I’m now using to make the risk assessments as current as possible. All of the models used to generate these assessments were trained on country-year data, because that’s the only form in which most of the requisite data is produced. To mimic the eventual forecasting process, the inputs to those models are all lagged one year at the model-estimation stage—so, for example, data on risk factors from 1985 are compared with outcomes in 1986, 1986 inputs to 1987 outcomes, and so on.
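That lag structure is easy to see in a toy panel. Using Python/pandas for illustration (the actual pipeline is in R, and the columns here are invented):

```python
import pandas as pd

# Toy country-year panel; column names are invented for illustration.
panel = pd.DataFrame({
    "country":    ["A", "A", "A", "B", "B", "B"],
    "year":       [1985, 1986, 1987, 1985, 1986, 1987],
    "gdp_growth": [2.1, -0.5, 1.0, 4.0, 3.5, 3.0],
    "onset":      [0, 0, 1, 0, 0, 0],
})

# Lag each input one year within country, so 1985 risk factors are
# paired with 1986 outcomes, 1986 with 1987, and so on.
panel = panel.sort_values(["country", "year"])
panel["gdp_growth_lag1"] = panel.groupby("country")["gdp_growth"].shift(1)
```

Grouping by country before shifting is the important detail: a plain shift would leak the last observation of one country into the first row of the next.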

If we stick rigidly to that structure at the forecasting stage, then I need data from 2013 to produce 2014 forecasts. Unfortunately, many of the sources for the measures used in these models won’t publish their 2013 data for at least a few more months. Faced with this problem, I could do something like what I aim to do with the coup forecasts I’ll be producing in the next few days—that is, only use data from sources that quickly and reliably update soon after the start of each year. Unfortunately again, though, the only way to do that would be to omit many of the variables most specific to the risk of mass atrocities—things like the occurrence of violent civil conflict or the political salience of elite ethnicity.

So now I'm trying something different. Instead of waiting until every last input has been updated for the previous year and they all neatly align in my rectangular data set, I am simply applying my algorithms to the most recent available observation of each input. It took some trial and error to write, but I now have an R script that automates this process at the country level by pulling the time series for each variable, omitting the missing values, reversing the series order, snipping off the observation at the start of that string, collecting those snippets in a new vector, and running that vector through the previously estimated model objects to get a forecast (see the section of this script starting at line 284).
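In Python/pandas terms (the author's routine is in R), the per-variable step amounts to taking the last non-missing value of each series. Everything below is a made-up illustration of that idea:

```python
import numpy as np
import pandas as pd

# Toy data with ragged update schedules: one source is a year behind.
series = pd.DataFrame({
    "year":       [2010, 2011, 2012],
    "polity":     [7.0,  7.0,  np.nan],  # 2012 value not yet published
    "gdp_growth": [1.2,  0.8,  2.5],     # already current
}).set_index("year")

# For each input, drop missing values and keep the most recent
# observation; that vector then goes to the estimated model objects.
latest = series.apply(lambda s: s.dropna().iloc[-1])
```

The result is a single row of "freshest available" inputs per country, which is exactly why the outputs are no longer anchored to one calendar year.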

One implicit goal of this approach is to make it easier to jump to batch processing, where the forecasting engine routinely and automatically pings the data sources online and updates whenever any of the requisite inputs has changed. So, for example, when in a few months the vaunted Polity IV Project releases its 2013 update, my forecasting contraption would catch and ingest the new version and the forecasts would change accordingly. I now have scripts that can do the statistical part but am going to be leaning on other folks to automate the wider routine as part of the early-warning system I’m helping build for the U.S. Holocaust Memorial Museum’s Center for the Prevention of Genocide.

The big upside of this opportunistic approach to updating is that the risk assessments are always as current as possible, conditional on the limitations of the available data. The way I figure, when you don’t have information that’s as fresh as you’d like, use the freshest information you’ve got.

The downside of this approach is that it’s not clear exactly what the outputs from that process represent. Technically, a forecast is a probabilistic statement about the likelihood of a specific event during a specific time period. The outputs from this process are still probabilistic statements about the likelihood of a specific event, but they are no longer anchored to a specific time period. The probabilities mapped at the top of this post mostly use data from 2012, but the inputs for some variables for some cases are a little older, while the inputs for some of the dynamic variables (e.g., GDP growth rates and coup attempts) are essentially current. So are those outputs forecasts for 2013, or for 2014, or something else?

For now, I’m going with “something else” and am thinking of the outputs from this machinery as the most up-to-date statistical risk assessments I can produce, but not forecasts as such. That description will probably sound like fudging to most statisticians, but it’s meant to be an honest reflection of both the strengths and limitations of the underlying approach.

Any gear heads who’ve read this far, I’d really appreciate hearing your thoughts on this strategy and any ideas you might have on other ways to resolve this conundrum, or any other aspect of this forecasting process. As noted at the top, the data and code used to produce these estimates are posted online. This work is part of a soon-to-launch, public early-warning system, so we hope and expect that they will have some effect on policy and advocacy planning processes. Given that aim, it behooves us to do whatever we can to make them as accurate as possible, so I would very much welcome any suggestions on how to do or describe this better.

Finally and as promised, here is a dot plot of the estimates mapped above. Countries are shown in descending order by estimated risk. The gray dots mark the forecasts from the three component models, and the red dot marks the unweighted average.


PS. In preparation for a presentation on this work at an upcoming workshop, I made a new map of the current assessments that works better, I think, than the one at the top of this post. Instead of coloring by quintiles, this new version (below) groups cases into several bins that roughly represent doublings of risk: less than 1%, 1-2%, 2-4%, 4-8%, and 8-16%. This version more accurately shows that the vast majority of countries are at extremely low risk and more clearly shows variations in risk among the ones that are not.

Estimated Risk of New State-Led Mass Killing

Estimated Risk of New State-Led Mass Killing

Why More Mass Killings in 2013, and What It Portends for This Year

In a recent post, I noted that 2013 had distinguished itself in a dismal way, by producing more new episodes of mass killing than any other year since the early 1990s. Now let’s talk about why.

Each of these mass killings surely involves some unique and specific local processes, and people who study in depth the societies where mass killings are occurring can say much better than I what those are. As someone who believes local politics is always embedded in a global system, however, I don't think we can fully understand these situations by considering only those idiosyncratic features. Sometimes we see "clusters" where none exist, but the evidence that we live in a global system leads me to think that isn't what's happening here.

To fully understand why a spate of mass killings is happening now, I think it helps to recognize that this cluster is occurring alongside—or, in some cases, in concert with—a spate of state collapses and during a period of unusually high social unrest. Systemic thinking leads me to believe that these processes are interrelated in explicable ways.

Just as there are boom and bust cycles within economies, there seem to be cycles of political (dis)order in the global political economy, too. Economic crunches help spur popular unrest. Economic crunches are often regional or global in nature, and unrest can inspire imitation. These reverberating challenges can shove open doors to institutional change, but they also tend to inspire harsh responses from incumbents intent on preserving the status quo ante. The ensuing clashes present exactly the conditions that are ripest for mass killing. Foreign governments react to these clashes in various ways, sometimes to try to quell the conflict and sometimes to back a favored side. These reactions often beget further reactions, however, and efforts to manufacture a resolution can end up catalyzing wider disorder instead.

In hindsight, I don’t think it’s an accident that the last phase of comparable disorder—the early 1990s—produced two iconic yet seemingly contradictory pieces of writing on political order: Francis Fukuyama’s The End of History and the Last Man, and Robert Kaplan’s “The Coming Anarchy.” A similar dynamic seems to be happening now. Periods of heightened disorder bring heightened uncertainty, with many possibilities both good and bad. All good things do not necessarily arrive together, and the disruptions that are producing some encouraging changes in political institutions at the national and global levels also open the door to horrifying violence.

Of course, in political terms, calendar years are an entirely arbitrary delineation of time. The mass killings I called out in that earlier post weren't all new in 2013, and the processes generating them don't reset with the arrival of a new year. In light of the intensification and spread of the now-regional war in Syria; escalating civil wars in Pakistan, Iraq, and Afghanistan; China's increasingly precarious condition; and the persistence of economic malaise in Europe, among other things, I think there's a good chance that we still haven't reached the peak of the current phase of global disorder. And, on mass killing in particular, I suspect that the persistence of this phase will probably continue to produce new episodes at a faster rate than we saw in the previous 20 years.

That’s the bad news. The slightly better news is that, while we (humanity) still aren’t nearly as effective at preventing mass killings as we’d like to be, there are signs that we’re getting better at it. In a recent post on United to End Genocide’s blog, Daniel Sullivan noted “five successes in genocide prevention in 2013,” and I think his list is a good one. Political scientist Bear Braumoeller encourages us to think of the structure of the international system as distributions of features deemed important by the major actors in it. Refracting Sullivan’s post through that lens, we can see how changes in the global distribution of political regime types, of formal and informal interdependencies among states, of ideas about atrocities prevention, and of organizations devoted to advocating for that cause seem to be enabling changes in responses to these episodes that are helping to stop or slow some of them sooner, making them somewhat less deadly on the whole.

The Central African Republic is a telling example. Attacks and clashes there have probably killed thousands over the past year, and even with deeper foreign intervention, the fighting hasn’t yet stopped. Still, in light of the reports we were receiving from people on the scene in early December (see here and here, for example), it’s easy to imagine this situation having spiraled much further downward already, had French forces and additional international assistance not arrived when they did. A similar process may be occurring now in South Sudan. Both cases already involve terrible violence on a large scale, but we should also acknowledge that both could have become much worse—and very likely will, if the braking efforts underway are not sustained or even intensified.

A Notable Year of the Wrong Kind

The year that’s about to end has distinguished itself in at least one way we’d prefer never to see again. By my reckoning, 2013 saw more new mass killings than any year since the early 1990s.

When I say “mass killing,” I mean any episode in which the deliberate actions of state agents or other organizations kill at least 1,000 noncombatant civilians from a discrete group. Mass killings are often but certainly not always perpetrated by states, and the groups they target may be identified in various ways, from their politics to their ethnicity, language, or religion. Thanks to my colleague Ben Valentino, we have a fairly reliable tally of episodes of state-led mass killing around the world since the mid-1940s. Unfortunately, there is no comparable reckoning of mass killings carried out by non-state actors—nearly always rebel groups of some kind—so we can’t make statements about counts and trends as confidently as I would like. Still, we do the best we can with the information we have.

With those definitions and caveats in mind, I would say that in 2013 mass killings began:

Of course, even as these new cases have developed, episodes of mass killings have continued in a number of other places:

In a follow-up post I hope to write soon, I’ll offer some ideas on why 2013 was such a bad year for deliberate mass violence against civilians. In the meantime, if you think I’ve misrepresented any of these cases here or overlooked any others, please use the Comments to set me straight.

Mass Atrocities in South Sudan

Since December 2012, state security forces in South Sudan’s Jonglei state have “repeatedly targeted civilians” in a “series of unlawful killings” that have killed scores and displaced tens of thousands, a new report from Human Rights Watch (HRW) says.

The report documents 24 incidents of unlawful killing that left 70 civilians and 24 ethnic Murle members of the security forces dead—and those are just the incidents HRW was able to document. In situations like this, the actual numbers of victims are almost always substantially higher than what groups like HRW can verify.

In academia’s grim typology of political violence against civilians, this episode doesn’t yet qualify as a mass killing, but it seems to be headed in that direction.

This episode also happens to fit the most common scenario for state-sponsored mass killing, in which security forces attempting to suppress an insurgency end up killing large numbers of civilians in areas where rebels are thought to operate or to enjoy popular support. As the HRW report discusses, the violence in Jonglei is part of a counterinsurgency campaign against a rebel group led by David Yau Yau, an ethnic Murle who took up arms against the government of South Sudan after failing to win a seat in 2010 elections, back when South Sudan was de facto but not yet de jure independent. Ironically but also typically, the army's abuses are proving counterproductive. As HRW notes,

Murle civilians told Human Rights Watch that an abusive army disarmament of civilians in 2012 in Pibor county fuelled the rebellion as Murle men, angered by abuses and unwilling to give up their guns, joined Yau Yau.

The fact that the atrocities are occurring in the context of a counterinsurgency campaign doesn’t mean that the insurgency is the only cause of the violence, however. As Caelin Briggs describes in a recent blog post for Refugees International (RI),

Other likely causes of violence have little to do with Yau Yau. NGOs told RI that SPLA soldiers frequently do not receive salaries, and that they are told by commanders that goods looted from civilians count as ‘payment’. As a result, looting of both civilian and NGO property is now one of the most visible abuses perpetrated by the SPLA in Jonglei. Impunity for these crimes is so extreme that soldiers are reportedly using stolen equipment inside their own barracks. The SPLA has also deliberately vandalized NGO property – perhaps, some NGOs say, with the express purpose of making it more difficult for international staff to return.

For better and for worse, this episode of atrocities was also foreseeable. Way back in the March 2012 issue of its bimonthly R2P Monitor (PDF), the Global Centre for the Responsibility to Protect (GCR2P) noted that efforts by the government of South Sudan to stop communal violence in Jonglei state by forcibly disarming local militias could have troubling side effects. “Several prominent NGOs have documented human rights abuses carried out by the SPLA during past disarmament campaigns,” the report noted. More recently, in a set of statistical forecasts I produced using data from the end of 2012, South Sudan showed up as one of the 10 countries worldwide at greatest risk of an onset of state-sponsored mass killing in 2013.

Using Wiki Surveys to Forecast Rare Events

Pro football starts back up for real in just a few weeks. Who’s going to win the next Super Bowl?

This is a hard forecasting problem. The real action hasn’t even started yet, so most of the information we have about how a team will play this season is just extrapolated from other recent seasons, and even without big personnel changes, team performance can vary quite a bit from year to year. Also, luck plays a big role in pro football; one bad injury to a key player or one unlucky break in a playoff game can visibly bend (or end) the arc of a team’s season. I suspect that most fans could do a pretty good job sorting teams now by their expected strength, but correctly guessing exactly who will win the championship is a much tougher nut to crack.

Of course, that doesn’t mean people aren’t trying. PredictWise borrows from online betting site Betfair to give us one well-informed set of forecasts about that question. Here’s a screenshot from this morning (August 11, 2013) of PredictWise’s ten most likely winners of Super Bowl XLVIII:

predictwise 2014 super bowl forecast 20130811

I’m not trying to make the leap into sports bookmaking, but I am interested in hard forecasting questions, and I’m currently using this Super Bowl-champs question as a toy problem to explore how we might apply a relatively new crowdsourced survey technology to forecasting rare events. So far, I think the results look promising.

The technology is a pairwise wiki survey. It’s being developed by a Princeton-based research project called All Our Ideas, and according to its creators, here’s how it works:

[A pairwise wiki survey] consists of a single question with many possible answer items. Respondents can participate in a pairwise wiki survey in two ways: first, they can make pairwise comparisons between items (i.e., respondents vote between item A and item B); and second, they can add new items that are then presented to future respondents.

The resulting pairwise votes are converted into aggregate ratings using a Bayesian hierarchical model that estimates collective preferences most consistent with the observed data.

Pairwise wiki surveys weren’t designed specifically for forecasting, but I think we can readily adapt them to versions of that task that involve comparisons of risk. Instead of asking which item in a pair respondents prefer, we can ask them which item in a pair is more likely. The resulting scores won’t quite be the forecasts we’re looking for, but they’ll contain a lot of useful information about rank ordering and relative likelihoods.

My Super Bowl survey has only been running for a few days now, but I’ve already got more than 1,300 votes, and the results it’s produced so far look pretty credible to me. Here’s a screenshot of the top 10 from my wiki survey on the next NFL champs, as of 8 AM on August 11, 2013. As you can see, the top 10 is nearly identical to the top 10 at Betfair—they’ve got the Bengals where my survey has the NY Giants—and even within the top 10, the teams sort pretty similarly.

allourideas 2014 super bowl survey results 20130811

The 0-100 scores in that chart aren’t estimates of the probability that a team will win the Super Bowl. Because only one team can win, those estimates would have to sum to 100 across all 32 teams in the league. Instead, the scores shown here are the model-estimated chances that each team will win if pitted against another team chosen at random. As such, they’re better thought of as estimates of relative strength with an as-yet unspecified relationship to the probability of winning the championship.

This scalar measure of relative strength will often be interesting on its own, but for forecasting applications, we'd usually prefer to have these values expressed as probabilities. Following PredictWise, we can get there by summing all the scores and treating that sum as the denominator in a likelihood ratio that behaves like a true probability. For example, when that screenshot of my wiki survey was taken, the scores across all 32 NFL teams summed to 1,607, so the estimated probability of the Atlanta Falcons winning Super Bowl XLVIII was 5.4% (87/1,607), while the chances that my younger son's Ravens will repeat were pegged at about 4.9% (79/1,607).
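The conversion is just a normalization. The Falcons (87) and Ravens (79) scores and the 1,607 total come from the survey at the time of the screenshot; no other team scores are needed for the arithmetic:

```python
# Normalizing wiki-survey scores into probabilities for a unique outcome.
total_score = 1607  # sum of scores across all 32 NFL teams

def win_probability(score, total=total_score):
    """Score as a share of the total, which behaves like a probability
    because exactly one team can win."""
    return score / total

falcons = win_probability(87)  # about 0.054
ravens = win_probability(79)   # about 0.049
```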

For problems with a unique outcome, this conversion is easy to do, because the contours of the event space are known in advance. As the immortals in Highlander would have it, “There can be only one.”

Things get tougher if we want to apply this technique to problems where there isn't a fixed number of events—say, trying to anticipate where coups or insurgencies or mass atrocities are likely to happen in the coming year. One way to extend the method to these kinds of problems would be to use historical data to identify the base rate of relevant events and then use that base rate as a multiplier in the transformation math as follows:

predicted probability = base rate * [ score / (sum of scores) ]

When a rare-events model is well calibrated, the sum of the predicted probabilities it produces should roughly equal the number of events that actually occur. The approach I’m proposing just works that logic in reverse, starting with the (reasonable) assumption that the base rate is a good forecast of the number of events that will occur and then inflating or deflating the estimated probabilities accordingly.

For example, say I’m interested in forecasting onsets of state-sponsored mass killing around the world, and I know the annual base rate of these events over the past few decades has only been about 1.2. I could use a pairwise wiki survey to ask respondents “Which country is more likely to experience an onset of state-sponsored mass killing in 2014?” and get scores like the ones in the All Our Ideas chart above. To convert the score for Country X to a predicted probability, I could sum the resulting scores for all countries, divide Country X’s score by that sum, and then multiply the result by 1.2.
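A minimal sketch of that transformation, using invented country scores and the ~1.2 annual base rate from the text:

```python
def rare_event_probs(scores, base_rate):
    """Scale normalized wiki-survey scores so that the predicted
    probabilities sum to the historical base rate of events per year."""
    total = sum(scores.values())
    return {name: base_rate * s / total for name, s in scores.items()}

# Invented scores for three countries; ~1.2 onsets per year historically.
scores = {"Country X": 90, "Country Y": 60, "Country Z": 50}
probs = rare_event_probs(scores, base_rate=1.2)
```

By construction the predicted probabilities sum to the base rate, which is the calibration property the surrounding text appeals to.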

This process might seem a bit ad hoc, but I think it’s one reasonable solution to a tough problem. In fact, this is basically the same thing that happens in a logistic regression model, which statisticians (and wannabes like me) often use to forecast discrete events. In the equation we get from a logistic regression model, the intercept captures information about the base rate and uses that as a starting point for all of the responses, which are initially expressed as log odds. The vector of predictors and the associated weights just slides the log odds up or down from that benchmark, and a final operation converts those log odds to a familiar 0-1 probability.
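The intercept-as-base-rate analogy can be made concrete: in an intercept-only logistic regression, the fitted intercept is just the log odds of the base rate, and inverting the logit recovers that rate. The numbers below are illustrative:

```python
import math

base_rate = 0.012  # illustrative: ~1.2 onsets per 100 country-years

# The intercept of an intercept-only logistic regression equals the log
# odds of the base rate; predictors then slide the log odds up or down.
intercept = math.log(base_rate / (1 - base_rate))

# The inverse-logit transform maps log odds back to a 0-1 probability.
recovered = 1 / (1 + math.exp(-intercept))
```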

On the football problem, I would expect Betfair to be more accurate than my wiki survey, because Betfair’s odds are based on the behavior of people who are putting money on the line. For rare events in international politics, though, there is no Betfair equivalent. In situations where statistical modeling is inefficient or impossible—or we just want to know what some pool of respondents believe the risks are—I think this wiki-survey approach could be a useful tool.

Two Forecasting Lessons from a Crazy Football Season

My younger son is a huge fan of the Baltimore Ravens, and his enthusiasm over the past several years has converted me, so we had a lot of fun (and gut-busting anxiety) watching the Super Bowl on Sunday.

As a dad and fan, my favorite part of the night was the Baltimore win. As a forecaster, though, my favorite discovery of the night was a web site called Advanced NFL Stats, one of a budding set of quant projects applied to the game of football. Among other things, Advanced NFL Stats produces charts of the probability that either team will win every pro game in progress, including the Super Bowl. These charts are apparently based on a massive compilation of stats from games past, and they are updated in real time. As we watched the game, I could periodically refresh the page on my mobile phone and give us a fairly reliable, up-to-the-minute forecast of the game’s outcome. Since the Super Bowl confetti has settled, I’ve spent some time poking through archived charts of the Ravens’ playoff run, and that exercise got me thinking about two lessons for forecasters.

1. Improbable doesn’t mean impossible.

To get to the Super Bowl, the Ravens had to beat the Denver Broncos in the divisional round of the playoffs. Trailing by seven with 3:12 left in that game, the Ravens turned the ball over to Denver on downs at the Broncos’ 31-yard line. To win from there, the Ravens would need a turnover or quick stop; then a touchdown; then either a successful two-point conversion or a first score in overtime.

As the chart below shows, the odds of all of those things coming together were awfully slim. At that point—just before “Regulation” on the chart’s bottom axis—Advanced NFL Stats’ live win-probability graph gave the Ravens roughly a 1% chance of winning. Put another way, if the game could be run 100 times from that position, we would only expect to see Baltimore win once.


Well, guess what happened? The one-in-a-hundred event, that’s what. Baltimore got the quick stop they needed, Denver punted, Joe Flacco launched a 70-yard bomb down the right sideline to Jacoby Jones for a touchdown, the Ravens pushed the game into overtime, and two minutes into the second extra period at Mile High Stadium, Justin Tucker booted a 47-yard field goal to carry Baltimore back to the AFC Championship.

For Ravens’ fans, that outcome was a %@$# miracle. For forecasters, it was a great reminder that even highly unlikely events happen sometimes. When Nate Silver’s model indicates on the eve of the 2012 election that President Obama has a 91% chance of winning, it isn’t saying that Obama is going to win. It’s saying he’s probably going to win, and the Ravens-Broncos game reminds us that there’s an important difference. Conversely, when a statistical model of rare events like coups or mass killings identifies certain countries as more susceptible than others, it isn’t necessarily suggesting that those highest-risk cases are definitely going to suffer those calamities. When dealing with events as rare as those, even the most vulnerable cases will escape most years without a crisis.

The larger point here is one that's been made many times but still deserves repeating: no single probabilistic forecast is plainly right or wrong. A sound forecasting process will reliably distinguish the more likely from the less likely, but it won't attempt to tell us exactly what's going to happen in every case. Instead, the more accurate the forecasts, the more closely the frequency of real-world outcomes or events will track the predicted probabilities assigned to them. If a meteorologist's model is really good, we should end up getting wet roughly half of the times she tells us there's a 50% chance of rain. And almost every time the live win-probability graph gives a football team a 99% chance of winning, they will go on to win that game—but, as my son will happily point out, not every time.
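That notion of calibration can be checked directly. Simulate a forecaster whose stated probabilities are exactly right, and the observed frequency of events within any band of forecasts should track the forecast level (a minimal sketch with made-up simulated data):

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate a perfectly calibrated forecaster: each event occurs with
# exactly its stated forecast probability.
forecasts = rng.uniform(0, 1, 100_000)
outcomes = rng.uniform(0, 1, 100_000) < forecasts

# Among forecasts near 50%, roughly half of the events should occur.
near_half = (forecasts > 0.45) & (forecasts < 0.55)
observed_rate = outcomes[near_half].mean()
```

The same binning check works in reverse on real forecasts: if the observed rate within a band drifts away from the forecast level, the forecaster is miscalibrated there.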

2. The “obvious” indicators aren’t always the most powerful predictors.

Take a look at the Advanced NFL Stats chart below, from Sunday’s Super Bowl. See that sharp dip on the right, close to the end? Something really interesting happened there: late in the game, Baltimore led on the scoreboard, 34-29, but trailed San Francisco in estimated win probability (roughly 45%).


How could that be? Consideration of the likely outcomes of the next two possessions makes it clearer. At the time, San Francisco had a first-and-goal situation from Baltimore’s seven yard line. Teams with four shots at the end zone from seven yards out usually score touchdowns, and teams that get the ball deep in their own territory with a two- or three-point deficit and less than two minutes to play usually lose. In that moment, the live forecast confirmed the dread that Ravens fans were feeling in our guts: even though San Francisco was still trailing, the game had probably slipped away from Baltimore.

I think there’s a useful lesson for forecasters in that peculiar situation: the most direct indicators don’t tell the whole story. In football, the team with a late-game lead is usually going to win, but Advanced NFL Stats’ data set and algorithm have uncovered at least one situation where that’s not the case.

This lesson also applies to efforts to forecast political processes, like violent conflict and regime collapse. With the former, we tend to think of low-level violence as the best predictor of future civil wars, but that’s not always true. It’s surely a valuable piece of information, but there are other sources of positive and negative feedback that might rein in incipient violence in some cases and produce sudden eruptions in others. Ditto for dramatic changes in political regimes. Eritrea, for example, recently had some sort of mutiny and North Korea did not, but that doesn’t necessarily mean the former is closer to breaking down than the latter. There may be features of the Eritrean regime that will allow it to weather those challenges and aspects of the North Korean regime that predispose it to more abrupt collapse.

In short, we shouldn’t ignore the seemingly obvious signals, but we should be careful to put them in their proper context, and the results will sometimes be counter-intuitive.

Which Past Will the Future Resemble?

Forecasts derived from statistical models depend on the assumption that the future will resemble the past. For modelers, the question is: which past? The time frame(s) we choose to use when developing our models and the ways we deal with history and time in our analyses can have substantial effects on the forecasts we produce, and this is one area where theory is at least as important as coding and statistical skills.

I was reminded of this point yesterday when Chrystia Freeland tweeted a link to the New York Fed’s blog, Liberty Street Economics. There, Ging Cee Ng and Andrea Tambalotti had a post showing (literally, as in with pictures) how the initial choice of a reference period would have affected whether or not standard macroeconomic models could have seen the Great Recession coming. Their concluding paragraph nicely sums up their findings:

Our calculations suggest that the Great Recession was indeed entirely off the radar of a standard macroeconomic model estimated with data drawn exclusively from the Great Moderation [a period of exceptional economic stability running from 1984 to 2007]. By contrast, the extreme events of 2008-09 are seen as far from impossible—if unlikely—by the same model when the shocks hitting the economy are gauged using data from a longer period (third-quarter 1954 to fourth-quarter 2007). These results provide a simple quantitative illustration of the extent to which the Great Moderation, and more specifically the assumption that the tranquil environment characterizing it was permanent, might have led economists to greatly underestimate the possibility of a Great Recession.

For social scientists trying to forecast rare events in the international system–things like civil wars, coups, and mass killings–thinking about historical eras that might be exerting some gravitational pull on patterns in the data often focuses on contrasts between the Cold War and post-Cold War periods. There are a host of ways in which the causes of internal and international crises might have changed when the USSR disintegrated, not the least of them being the end of the proxy wars the rival powers often waged and the coups they sometimes endorsed or promoted.

When developing a model for forecasting, we’re tempted to restrict the analysis to the post-Cold War period with the expectation that the near future will more closely resemble the near past. But what if the two decades following the collapse of the Soviet Union turn out to be the global political equivalent of the Great Moderation? Ng and Tambalotti’s analysis suggests that models estimated from post-Cold War data only would work better as long as the moderation holds (held?), but they would start missing badly when the system shifted back to something more like its long-term state.

Efforts to forecast onsets of mass killing provide a useful example. In the plot of mass-killing onsets by year shown below, the frequency with which these events occur seems to have changed markedly in the post-Cold War period. Over the three decades between decolonization in Africa and the disintegration of the USSR, most years saw two mass-killing onsets, and only a few saw none. Since the spasm of onsets that accompanied the collapse of Communist rule in the early 1990s, however, zero has been the most common occurrence, and no year has seen more than a single onset.

Mass-Killing Onsets Worldwide by Year, 1945-2011

Until 2011, that is. For the first time in almost two decades, a single year produced two mass-killing onsets, in Sudan and Syria. Two onsets is not nearly enough to claim that things have changed, but it is enough to get me thinking—especially when it’s also possible that episodes of mass killing may have begun last year in Libya, Yemen, and Egypt.
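To put rough numbers on why a two-onset year gets my attention, here’s a back-of-the-envelope Poisson sketch. The ~2.0 and ~0.5 annual rates below are guesses eyeballed from the plot above, not the actual counts in the data set:

```python
import math

# Back-of-the-envelope sketch: how likely is a year with 2+ onsets
# under a Poisson model, depending on which era sets the annual rate?
# The rates are rough guesses for illustration, not the actual counts.
def p_two_or_more(rate):
    """P(X >= 2) when X ~ Poisson(rate): 1 - P(0) - P(1)."""
    return 1 - math.exp(-rate) * (1 + rate)

for label, rate in [("Cold War-era rate (~2.0/yr)", 2.0),
                    ("post-Cold War rate (~0.5/yr)", 0.5)]:
    print(f"{label}: P(2+ onsets in a year) = {p_two_or_more(rate):.2f}")
```

Under the post-Cold War rate, a year like 2011 is roughly a 1-in-11 event; under the longer-run rate, it would be entirely unremarkable. Which training window you pick effectively decides which of those two readings your model gives.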

Maybe 2011 was an anomaly, an unlikely but not impossible year in an ongoing era of less frequent or less intense attacks by states on their civilian populations. Maybe, though, it marked the end of an unusually pacific run, and the world is now sliding back toward its old normal. Until the future actually happens, the best answer we can offer to that question is informed speculation.

In the meantime, modelers looking to assess risks of mass killing in the next several years have to make some practical choices. And, as Ng and Tambalotti’s analysis shows, those choices will have a real effect on the forecasts we produce.
