Two Tidbits on Social Unrest

1. We like to tell tidy stories about why social unrest happens, and those stories usually involve themes of grievance or social injustice—things like hardship, inequality, corruption, discrimination, and political repression. One or more of those forces probably plays a role in many bouts of unrest, especially the ones that emerge from or evolve into sustained action like we’re seeing right now in Hong Kong and Ferguson.

Still, a riot over the weekend at a pumpkin festival in semi-rural Keene, New Hampshire, reminds us that you don’t need those big issues or themes to get to social unrest. According to the L.A. Times, in Keene,

Young people:
  • chucked beer cans and cups at each other
  • jumped off roofs
  • tore down, kicked, and smashed road signs
  • set a large fire and chanted profanity
  • celebrated on top of a flipped car
  • took selfies in front of lines of riot police
  • got the attention of a police helicopter
  • chanted “U-S-A!”
  • pushed barricades and threw a street sign at police
  • threw bottles at the police after the police threw tear gas
  • and left behind a huge mess.

Why? Who knows, but the main ingredients in this instance seem to have been youth, alcohol, numbers, and the pleasure of transgression.

The description of the scene in Keene reminded me of the riots that sometimes erupt in college towns and sports-mad cities after big games, some of which have proven extremely destructive. These riots differ qualitatively from the rallies, marches, sit-ins, and the like that social scientists generally study. For two things, they usually aren’t planned in advance, and the participants aren’t making political claims. Still, I think our understanding of those ostensibly more political forms of collective action suffers when we make our causal narratives too tidy and ignore the forces that also produce these other kinds of outbursts.

2. Contagion is one of those forces that seems to operate across many forms of unrest. We’re sure that’s true, but we still don’t understand very well how that process works. Observers often use dominoes as a metaphor for contagion, implying that a given unit must fall in order for the cascade to pass through it.

A new paper on arXiv proposes another mechanism that allows the impulse to “hop” some units—in other words, to pass through them without producing the same type of event or effect. Instead of dominoes, contagion might work more like a virus that some people can catch and transmit without ever becoming symptomatic themselves. The authors think this mechanism could help to explain the timing and sequencing of protests in the Arab Spring:

In models of protests and revolutions, populations can have two stable equilibria—the size of the protest is either large or negligibly small—because of strategic complementarities (protest becomes more attractive as more people protest). During the Arab Spring, each country had unique grievances and agendas, and we hypothesize that each country had a unique proximity to a tipping point beyond which people would protest. Once protests began in one country (Tunisia), inspiration to protest spread to other countries via traditional media (such as newspapers) and via social media (such as Twitter and Facebook). This cross-border communication spread strategies for successful uprisings, and it increased expectations for success. Consequently, the uprisings began within a short window of time, seemingly cascading among countries more quickly than earlier revolutions did.

In coarse-grained data on the number of Facebook friendships between countries, we find evidence of the “cascade hopping” phenomenon described above. In particular, Saudi Arabia and Egypt appear to play the role of an intermediate country Y that propagated influence to protest from protesting countries to non-protesting countries, thereby helping to trigger protest in the latter countries, without themselves protesting until much later. Attributes of these intermediate countries and of the countries that they may have influenced to protest suggest that protests first spread to countries close to their tipping points (high unemployment and economic inequality) and strongly coupled to other countries via social media (measured as high Internet penetration). By contrast, we find that traditional measures of susceptibility to protest, such as political freedoms and food price indices, could not predict the order in which protests began.
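Here is a deliberately crude toy version of that hopping idea in R—my own sketch, not the model in the paper. Three countries sit in a chain, A–Y–B; A is already protesting, B sits close to its tipping point, and Y sits far from its own tipping point but is strongly coupled to both neighbors. All of the numbers are made up for illustration:

  # toy parameters: how close each country starts to its tipping point,
  # how strongly countries are coupled, and how much inspiration survives a hop
  coupling <- matrix(c(0, 1, 0,
                       1, 0, 1,
                       0, 1, 0), nrow = 3, byrow = TRUE,
                     dimnames = list(c("A", "Y", "B"), c("A", "Y", "B")))
  baseline  <- c(A = 1.0, Y = -0.2, B = 0.6)
  threshold <- 1.0
  damping   <- 0.7

  protesting  <- baseline >= threshold      # A starts out over the threshold
  inspiration <- as.numeric(protesting)

  for (step in 1:5) {
    # each country relays the stronger of its own protest or the inspiration it holds
    relayed     <- pmin(pmax(inspiration, as.numeric(protesting)), 1)
    incoming    <- damping * as.vector(coupling %*% relayed)
    inspiration <- pmin(pmax(inspiration, incoming), 1)
    protesting  <- protesting | (baseline + inspiration >= threshold)
    cat("step", step, "- protesting:",
        paste(names(protesting)[protesting], collapse = ", "), "\n")
  }
  # B crosses its threshold within a couple of steps via Y, even though Y
  # itself never protests in this run--the cascade "hops" through it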

As with the structural and dynamic stuff discussed around this weekend’s riot in Keene, this hopping mechanism will never be the only force at work in any instance of social unrest. Even so, it’s a useful addition to the set of processes we ought to consider whenever we try to explain or predict where and when other instances might happen.

Forecasting Round-up No. 8

1. The latest Chronicle of Higher Education includes a piece on forecasting international affairs (here) by Beth McMurtrie, who asserts that

Forecasting is undergoing a revolution, driven by digitized data, government money, new ways to analyze information, and discoveries about how to get the best forecasts out of people.

The article covers terrain that is familiar to anyone working in this field, but I think it gives a solid overview of the current landscape. (Disclosure: I’m quoted in the piece, and it describes several research projects for which I have done or now do paid work.)

2. Yesterday, I discovered a new R package that looks to be very useful for evaluating and comparing forecasts. It’s called ‘scoring’, and it does just that, providing functions to implement an array of proper scoring rules for probabilistic predictions of binary and categorical outcomes. The rules themselves are nicely discussed in a 2013 publication co-authored by the package’s creator, Ed Merkle, and Mark Steyvers. Those rules and a number of others are also discussed in a paper by Patrick Brandt, John Freeman, and Phil Schrodt that appeared in the International Journal of Forecasting last year (earlier ungated version here).

I found the package because I was trying to break the habit of always using the area under the ROC curve, or AUC score, to evaluate and compare the accuracy of forecasts from statistical models of rare events. AUC is quite useful as far as it goes, but it doesn’t address all aspects of forecast accuracy we might care about. Mathematically, the AUC score represents the probability that a prediction selected at random from the set of cases that had an event of interest (e.g., a coup attempt or civil-war onset) will be larger than a prediction selected at random from the set of cases that didn’t. In other words, AUC deals strictly in relative ranking and tells us nothing about calibration.
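To make that rank-order interpretation concrete, here is a tiny R sketch with made-up numbers (toy values, not output from any of the models discussed here):

  p_event    <- c(0.40, 0.15, 0.08)         # predictions for cases that had an event
  p_nonevent <- c(0.30, 0.05, 0.02, 0.01)   # predictions for cases that didn't

  # AUC as the share of event/non-event pairs in which the event case gets
  # the higher prediction (ties count half)
  auc <- function(p1, p0) mean(outer(p1, p0, function(a, b) (a > b) + 0.5 * (a == b)))
  auc(p_event, p_nonevent)
  # dividing every prediction by 10 leaves the score unchanged, which is
  # exactly the calibration blind spot described above
  auc(p_event / 10, p_nonevent / 10)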

This came up in my work this week when I tried to compare out-of-sample estimates from three machine-learning algorithms—kernel-based regularized least squares (KRLS), Random Forests (RF), and support vector machines (SVM)—trained on and then applied to the same variables and data. In five-fold cross-validation, the three algorithms produced similar AUC scores, but histograms of the out-of-sample estimates showed much less variance for KRLS than RF and SVM. The mean out-of-sample “forecast” from all three was about 0.009, the base rate for the event, but the maximum for KRLS was only about 0.01, compared with maxes in the 0.4s and 0.7s for the others. It turned out that KRLS was doing about as well at rank ordering the cases as RF and SVM, but it was much more conservative in estimating the likelihood of an event. To consider that difference in my comparisons, I needed to apply scoring rules that were sensitive to forecast calibration and my particular concern with avoiding false negatives, and Merkle’s ‘scoring’ package gave me the functions I needed to do that. (More on the results some other time.)
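For the comparison itself, even a hand-rolled Brier score shows what AUC misses. The numbers below are toy values of my own that mimic the pattern I just described—one very conservative set of estimates, one more spread out—not actual output from those models; the ‘scoring’ package provides ready-made functions for this and other rules:

  y      <- c(0, 0, 0, 0, 1)                       # observed outcomes
  p_krls <- c(0.005, 0.006, 0.008, 0.009, 0.010)   # very conservative estimates
  p_rf   <- c(0.010, 0.020, 0.050, 0.300, 0.700)   # more spread-out estimates

  brier <- function(p, y) mean((p - y)^2)          # quadratic scoring rule; lower is better
  brier(p_krls, y)
  brier(p_rf, y)
  # both sets rank the lone event case highest, so their AUC is identical,
  # but the Brier score heavily penalizes the conservative estimates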

3. Last week, Andreas Beger wrote a great post for the WardLab blog, Predictive Heuristics, cogently explaining why event data is so important to improving forecasts of political crises:

To predict something that changes…you need predictors that change.

That sounds obvious, and in one sense it is. As Beger describes, though, most of the models political scientists have built so far have used slow-changing country-year data to try to anticipate not just where but also when crises like coup attempts or civil-war onsets will occur. Some of those models are very good at the “where” part, but, unsurprisingly, none of them does so hot on the “when” part. Beger explains why that’s true and how new data on political events can help us fix that.

4. Finally, Chris Blattman, Rob Blair, and Alexandra Hartman have posted a new working paper on predicting violence at the local level in “fragile” states. As they describe in their abstract,

We use forecasting models and new data from 242 Liberian communities to show that it is possible to predict outbreaks of local violence with high sensitivity and moderate accuracy, even with limited data. We train our models to predict communal and criminal violence in 2010 using risk factors measured in 2008. We compare predictions to actual violence in 2012 and find that up to 88% of all violence is correctly predicted. True positives come at the cost of many false positives, giving overall accuracy between 33% and 50%.

The patterns Blattman and Blair describe in that last sentence are related to what Beger was talking about with cross-national forecasting. Blattman, Blair, and Hartman’s models run on survey data and some other structural measures describing conditions in a sample of Liberian localities. Their predictive algorithms were derived from a single time step: inputs from 2008 and observations of violence from 2010. When those algorithms are applied to data from 2010 to predict violence in 2012, they do okay—not great, but “[similar] to some of the earliest prediction efforts at the cross-national level.” As the authors say, to do much better at this task, we’re going to need more and more dynamic data covering a wider range of cases.

Whatever the results, I think it’s great that the authors are trying to forecast at all. Even better, they make explicit the connections they see between theory building, data collection, data exploration, and prediction. On that subject, the authors get the last word:

However important deductive hypothesis testing remains, there is much to gain from inductive, data-driven approaches as well. Conflict is a complex phenomenon with many potential risk factors, and it is rarely possible to adjudicate between them on ex ante theoretical grounds. As datasets on local violence proliferate, it may be more fruitful to (on occasion) let the data decide. Agnosticism may help focus attention on the dependent variable and illuminate substantively and statistically significant relationships that the analyst would not have otherwise detected. This does not mean running “kitchen sink” regressions, but rather seeking models that produce consistent, interpretable results in high dimensions and (at the same time) improve predictive power. Unexpected correlations, if robust, provide puzzles and stylized facts for future theories to explain, and thus generate important new avenues of research. Forecasting can be an important tool in inductive theory-building in an area as poorly understood as local violence.

Finally, testing the predictive power of exogenous, statistically significant causes of violence can tell us much about their substantive significance—a quantity too often ignored in the comparative politics and international relations literature. A causal model that cannot generate predictions with some reasonable degree of accuracy is not in fact a causal model at all.

Meanwhile, In the Lives of Hundreds of Millions of Asians…

While our social-media feeds and cable-news crawls were inundating us with news of the latest bombing, beheading, armed clash, plane crash, and viral epidemic, this was happening, too:

Rural wages are rising across much of Asia, and in some cases have accelerated since the mid 2000s. And they are doing so fast (and getting faster)… Doubling in China in the last decade, tripling or quadrupling in Vietnam. A bit slower in Bangladesh, but still up by half. This really matters because landless rural people are bottom of the heap (72% of Asia’s extreme poor are rural—some 687m people in 2008), so what they can pick up from their casual labour is a key determinant of poverty, or the lack of it. Steve argues that if the trend continues (and it looks like it will) this spells ‘the end of mass (extreme) poverty in Asia’.

That quote comes from a recent post by Duncan Green for his From Poverty to Power blog. The emphasis is mine. The Steve referenced in the last line is economist Steve Wiggins, co-author with Sharada Keats of a new report, on rural wages in Asia, from which those findings flow.

The good news from this report doesn’t stop in Asia. As Green also summarizes, higher rural wages in many Asian countries are driving up wages from manufacturing and increasing the costs of food production. Those trends should help tilt comparative advantage in food production and low-wage manufacturing toward Africa and lower-income parts of Asia. As that happens, the prospects for similar transformations occurring in those areas should improve, too.

There is no shortage of catastrophes in the world right now, and climate change runs under the whole thing like a fault line that’s started trembling with peak intensity and consequences still unknown. Meanwhile, though, most people in most parts of the world are quietly going about the business of trying to make their own lives a little bit better. And, apparently, many of them are succeeding. We shouldn’t let the incessant flow of bad news obscure our view of the larger system. This report is yet another indication that, at that level, some important things are still trending positive in spite of all the terrible things we more easily observe.

Thoughts on the Power of Civil Resistance

“People power” is a blunt and in some ways soft instrument. Activists engaged in mass protest are usually seeking formal changes in the rules or leadership of organizations to which they do not belong or in which their votes are not counted. Unfortunately for them, there is no clear or direct mechanism for converting the energy of the street into the production of those changes.

Once nonviolent action begins, however, state repression becomes a blunt instrument, too. The varied and often discreet routines states use to prevent challenges from emerging become mostly irrelevant. Instead, states must switch to a repertoire of clumsier and less familiar actions with larger and more immediate consequences.

The awkwardness of this response turns out to be the mechanism that converts people power into change, or at least the possibility for it. States thrive on routines around which they can build bureaucracies and normalize public expectations. Activists who succeed at mobilizing and sustaining mass challenges force the state onto less familiar footing, where those bureaucracies’ routines don’t apply and public expectations are weakly formed. In so doing, activists instill uncertainty in the minds of officials who must respond and of the observers of these interactions.

Responses to that uncertainty don’t always break in favor of the challengers, but they can. Insiders who comfortably played supporting roles before must consider what will happen to them if the challenge succeeds and how they might shape that future in their own favor. Other observers, foreign and domestic, may become newly energized or at least sympathetic, and even small alterations in the behaviors of those individuals can accumulate into large changes in the behavior of the public writ large. Importantly, these responses are more likely to break in favor of the challengers when those challengers manage to sustain nonviolence, even in the face of state repression.

Activists cannot control the reactions catalyzed by this uncertainty, but neither can the state. The result is an opportunity, a roll of the dice that would not have happened in the absence of the public challenge. And, really, that’s the point. That opportunity is not a sufficient condition for deep change, but it is a necessary one, and it almost never arises without a provocation.

How the Umbrella Revolution Could Win

I’m watching Hong Kong’s “umbrella revolution” from afar and wondering how an assemblage of unarmed students and professionals might succeed in wresting change from a dictatorship that has consistently and ruthlessly repressed other challenges to its authority for decades. I have already said that I expect the state to repress again and think it unlikely that China’s Communist regime will bend sharply or break in response to this particular challenge at this particular moment. But unlikely doesn’t mean impossible, and, like many observers, I hope for something better.

How could something better happen? For me, Kurt Schock’s Unarmed Insurrections remains the single most-useful source on this topic. In that 2005 book, Schock compares successful and failed “people power” movements from the late twentieth century to try to identify patterns that distinguish the former from the latter. Schock clearly sympathizes with the nonviolent protesters whose actions he describes, but that sympathy seems to motivate him to make his analysis as rigorous as possible in hopes of learning something that might inform future movements.

Schock’s overarching conclusion is that structure is not destiny—that movement participants can improve their odds of success through the strategies and tactics they choose. In this he echoes the findings of his mentors, who argued in a 1996 book (p. 15) that “movements may be largely born of environmental opportunities, but their fate is heavily shaped by their own actions.” Schock’s theoretical framework is also openly influenced by the pragmatic advocacy of Gene Sharp, but his analysis confirms the basic tenets of that approach.

So, which strategies and tactics improve the odds of movement success? On this, Schock writes (p. 143, emphasis added):

The trajectories of unarmed insurrections are shaped by the extent to which interactions between challengers, the state, and third parties produce shifts in the balance of power. The probability that an unarmed insurrection will tip the balance of power in favor of the challengers is a function of its resilience and leverage. By remaining resilient in the face of repression and effecting the withdrawal of support from or pressure against the state through its dependence relations, the state’s capacity to rule may be diminished, third-party support for the movement may be mobilized, and the coherence of the political or military elite may fracture, that is, the political context may be recast to one more favorable to the challenge.

Resilience refers to the movement’s capacity to keep mobilizing and acting in the face of attempts to repress or disperse it. Leverage refers to the movement’s ability to get constituencies on whose support the regime depends—security forces, local business and political leaders, labor groups, sometimes foreign governments and markets—to support their cause, either directly, through participation or the provision of other resources, or indirectly, through pressure on the regime to reform or concede.

On what makes movements resilient, Schock’s analysis points (p. 143) to “decentralized yet coordinated organizational networks, the ability to implement multiple actions from across the three methods of nonviolent action [protest and persuasion, noncooperation, and nonviolent intervention], the ability to implement methods of dispersion as well as methods of concentration, and tactical innovation.”

Schock concludes his study with a list of six lessons that nonviolent challengers might draw from successes of the past about how to improve their own odds of success. Paraphrased and summarized, those six lessons are:

  • Set clear and limited goals. “The goals of movements should be well chosen, clearly defined, and understood by all parties to the conflict. The goals should be compelling and vital to the interests of the challenging group, and they should attract the widest possible support, both within society and externally… Precise goals give direction to the power activated by a movement and inhibit the dispersion of mobilized energies and resources.”
  • Adopt oppositional consciousness and build temporary organizations. “Oppositional consciousness is open-ended, nontotalizing, and respectful of diversity, and it facilitates the mobilization of a broad-based opposition.” Oppositional consciousness also “rejects permanent, centralized organizations and vanguard parties, opting for united front politics, shifting alliances, and temporary organizations that engage in struggles as situations arise.”
  • Engage in multiple channels of resistance. Here, Schock focuses on the value of pairing actions through institutional (e.g., elections) and non-institutional (e.g., street demonstrations) channels. In other words, attack on as many fronts as possible.
  • Employ multiple methods of nonviolent action. “Struggles for political change should not depend on a single event, however momentous, but rather should focus on the process of shifting the balance of political power through a range of mutually supporting actions over time.”
  • Act in multiple spaces and places of resistance. In addition to public rallies and demonstrations, activists can employ methods of non-cooperation (e.g., strikes and boycotts) and try to create “liberated areas” outside the state’s control. (Nowadays, these areas might exist online as well as in physical space.)
  • Communicate. “Communication among the challengers, accurate public knowledge about the movement, and international media coverage all increase the likelihood of success.”

Looking at the umbrella revolution through that lens, I’d say it is doing all of these things already—self-consciously, I would guess—and those actions seem to be having the desired effects of expanding local and international support for their movement and improving its resilience. Just today, the movement reiterated an ambitious but clear and limited set of goals that are positive and broadly appealing. Activists are working cooperatively through an array of organizations. They have built communications networks that are designed to withstand all but the most draconian attempts to shut them down. Participants are using the internet to spread knowledge about their movement, and a bevy of foreign reporters in Hong Kong are amplifying that message. The possible exception comes in the limited range of actions the movement is using. At the moment, the challenge seems to be heavily invested in the occupation of public spaces. That may change, however, as the movement persists or if and when it is confronted with even harsher repression.

More important, this uprising was not born last Friday. The longer arc of this challenge includes a much wider array of methods and spaces, including this summer’s referendum and the marches and actions of political and business elites that accompanied and surrounded them. As Jeff Wasserstrom described in a recent interview with Vox, the Occupy Central movement also connects to a longer history of pro-democracy dissent in Hong Kong under Beijing’s rule and beyond. In other words, this movement is much bigger and more deeply rooted than the occupations we’re witnessing right now, and it has already proved resilient to repeated attempts to quash it.

As Schock and Sharp and many others would argue, those shrewd choices and that resiliency do not ensure success, but they should improve prospects for it. Based on patterns from similar moments around the world in recent decades and the Communist Party of China’s demonstrated intolerance for popular challenges, I continue to anticipate that the ongoing occupations will soon face even harsher attempts to repress them than the relatively modest ones we saw last weekend. Perhaps that won’t happen, though, and if it does, I am optimistic that the larger movement will survive that response and eventually realize its goals, hopefully sooner rather than later.

Occupy Central and the Rising Risk of New Mass Atrocities in China

This is a cross-post from the blog of the Early Warning Project, which I currently direct. The Early Warning Project concentrates on risks of mass atrocities, but this post also draws on my longstanding interest in democratization and social unrest, so I thought I would share it here as well.

Activists have massed by the thousands in central Hong Kong for the past several days in defiance of repeated attempts to disperse them and menacing words from Beijing. This demonstration and the wider Occupy Central movement from which it springs pose one of the sharpest public challenges to Communist Party authority since the Tiananmen Square uprising 25 years ago. In so doing, they clearly raise the risk of new mass atrocities in China.

Photo credit: AP via BBC News

The demonstrations underway now are really just the latest surge in a wave of activism that began in Hong Kong earlier this year. Under the “one country, two systems” framework to which China committed when it regained sovereignty over the then–UK colony in 1997, Hong Kong is supposed to enjoy a great deal of autonomy over local governance. This summer, however, Beijing issued a white paper affirming the central government’s “comprehensive jurisdiction” over Hong Kong, and it blocked plans for open nominations in local elections due in 2017. Those actions spurred (and were spurred by) an unofficial referendum and a mass pro-democracy rally that eventually ebbed from the streets but left behind a strengthened civic movement.

The ongoing demonstrations began with a student boycott of classes a week ago, but they escalated sharply on Friday, when activists began occupying key public spaces in central Hong Kong. Police have made several forceful attempts to disperse or remove the protesters, and official channels have said publicly that Beijing “firmly opposes all illegal activities that could undermine rule of law and jeopardise ‘social tranquility'” in Hong Kong. So far, however, the occupations have proved resilient to those thrusts and threats.

Many observers are now openly wondering how this confrontation will end. For those sympathetic to the protesters, the fear is that Beijing will respond with lethal force, as it did at Tiananmen Square in 1989.

As it happens, the Early Warning Project’s statistical risk assessments do not identify China as a country at relatively high risk of state-led mass killing this year. Partly because of that, we do not currently have a question open on our opinion pool that covers this situation. (Our lone China question focuses on the risk of state-led mass atrocities targeting Uyghurs.)

If we did have a relevant question open on our opinion pool, however, I would be raising my estimate of the risk of a state-led mass killing in response to these developments. I still don’t expect that one will occur, but not because I anticipate that Beijing will concede to the protesters’ demands. Rather, I expect violent repression, but I also doubt that it will cross the 1,000-death threshold we and others use to distinguish episodes of mass killing from smaller-scale and more routine atrocities.

State-led mass killings as we define them usually occur when incumbent rulers perceive potentially existential threats to their authority. Following leading theories on the subject, our statistical analysis concentrates on armed insurgencies and coups as the forms those threats typically take. Authoritarian governments often suppress swelling demonstrations with violence as well, but those crackdowns rarely kill as many as 1,000 nonviolent protesters, who usually disperse long before that threshold is reached. Even the Tiananmen Square massacre probably fell short of this threshold, killing “only” hundreds of activists before achieving the regime’s goal of dispersing the occupation and setting an example that would frighten future dissenters.

Instead, violent state crackdowns usually push countries onto one of three other pathways before they produce more than 1,000 fatalities: 1) they succeed at breaking the uprising and essentially restore the status quo ante (e.g., China in 1989, Uzbekistan in 2005, Burma in 2007, and Thailand in 2010); 2) they suppress the nonviolent challenge but, in so doing, help to spawn a violent rebellion that may or may not be met with a mass killing of its own (e.g., Syria since 2011); or 3) they catalyze splits in state security forces or civilian rulers that lead to negotiations, reforms, or regime collapse (e.g., Egypt and Tunisia in 2011). In short, nonviolent uprisings usually lose, transform, or win before the attempts to suppress them amount to what we would call a state-led mass killing.

In Hong Kong right now, the first path—successful repression—appears to be the most likely. Chinese Communist Party leaders have spoken openly in recent years about trying to learn from the mistakes that led to collapse of the Soviet Union, and the mixed signals that were sent to early risers in the USSR—some protests were repressed, but others were allowed to run their course or met with modest concessions—probably rank high on their list of things to avoid. Those Party leaders also know that activists and separatists elsewhere in China are closely watching events in Hong Kong and would probably take encouragement from anything short of a total defeat for Occupy Central. These considerations generate strong incentives to try to quash the current challenge.

In contrast, the second of those three trajectories—a transformation to violent insurgency in response to state repression—seems highly unlikely. Protesters have shown a strong commitment to nonviolence so far and have strategic as well as ideological reasons to continue to do so; after all, the People’s Liberation Army is about as formidable a foe as they come. Brutal state repression might radicalize some survivors and inspire other onlookers, but Hong Kong is a wealthy, urban enclave with minimal access to arms, so a turn toward violent rebellion would face tall structural obstacles.

The third of those trajectories also seems unlikely, albeit somewhat less so than the second. The Communist Party currently faces several profound challenges: a slowing rate of economic growth and widespread concern about a looming financial crisis; an escalating insurgency in Xinjiang; and an epidemic of local protests over pollution, to name just a few. Meanwhile, Xi Jinping’s anti-corruption campaign is creating new fissures within the country’s ruling class, and rumors of dissent within the military have swirled occasionally in the past two years as well. As I discussed in a recent blog post, consolidated single-party regimes like China’s usually weather these kinds of challenges. When they do break down, however, it almost always happens in times like these, when worried insiders start to fight among themselves and form alliances with emboldened popular challengers.

Put those considerations together, and it seems that Beijing is most likely to respond to Occupy Central with a crackdown that could be lethal but probably will not cross the 1,000-death threshold we use to distinguish episodes of mass killing from more routine political violence. It seems less likely but still possible that the prospect or occurrence of such a crackdown will catalyze the kinds of elite splits that could finally produce significant political reform or sustained instability in China. Under none of these circumstances would I expect the challenge in Hong Kong to evolve into an armed rebellion that might produce a new wave of atrocities of its own.

No matter what the immediate outcome, though, it seems increasingly clear that China has entered a period of “thickened history,” as Mark Beissinger calls it, in which national politics will remain more eventful and less certain for some time to come.

Why political scientists should predict

Last week, Hans Noel wrote a post for Mischiefs of Faction provocatively titled “Stop trying to predict the future”. I say provocatively because, if I read the post correctly, Noel’s argument deliberately refutes his own headline. Noel wasn’t making a case against forecasting. Rather, he was arguing in favor of forecasting, as long as it’s done in service of social-scientific objectives.

If that’s right, then I largely agree with Noel’s argument and would restate it as follows. Political scientists shouldn’t get sucked into bickering with their colleagues over small differences in forecast accuracy around single events, because those differences will rarely contain enough information for us to learn much from them. Instead, we should take prediction seriously as a means of testing competing theories by doing two things.

First, we should build forecasting models that clearly represent contrasting sets of beliefs about the causes and precursors of the things we’re trying to predict. In Noel’s example, U.S. election forecasts are only scientifically interesting in so far as they come from models that instantiate different beliefs about why Americans vote like they do. If, for example, a model that incorporates information about trends in unemployment consistently produces more accurate forecasts than a very similar model that doesn’t, then we can strengthen our confidence that trends in unemployment shape voter behavior. If all the predictive models use only the same inputs—polls, for example—we don’t leave ourselves much room to learn about theories from them.
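In code, that kind of test is nothing exotic: fit two models that differ only in the theoretically contested input, and score their out-of-sample forecasts across many elections. The sketch below is schematic—the variable and data names are placeholders of my own, not anyone’s actual election model:

  # df_train and df_test hold election-year observations (placeholder names)
  m_polls <- glm(incumbent_win ~ polls, data = df_train, family = binomial)
  m_econ  <- glm(incumbent_win ~ polls + unemployment_trend,
                 data = df_train, family = binomial)

  p_polls <- predict(m_polls, newdata = df_test, type = "response")
  p_econ  <- predict(m_econ,  newdata = df_test, type = "response")

  # if the economy-augmented model is consistently more accurate out of sample,
  # that strengthens our confidence that unemployment trends shape voter behavior
  mean((p_polls - df_test$incumbent_win)^2)   # Brier score, polls only
  mean((p_econ  - df_test$incumbent_win)^2)   # Brier score, polls plus economy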

In my work for the Early Warning Project, I have tried to follow this principle by organizing our multi-model ensemble around a pair of models that represent overlapping but distinct ideas about the origins of state-led mass killing. One model focuses on the characteristics of the political regimes that might perpetrate this kind of violence, while another focuses on the circumstances in which those regimes might find themselves. These models embody competing claims about why states kill, so a comparison of their predictive accuracy will give us a chance to learn something about the relative explanatory power of those competing claims. Most of the current work on forecasting U.S. elections follows this principle too, by the way, even if that’s not what gets emphasized in media coverage of their work.

Second, we should only really compare the predictive power of those models across multiple events or a longer time span, where we can be more confident that observed differences in accuracy are meaningful. This is basic statistics. The smaller the sample, the less confident we can be that it is representative of the underlying distribution(s) from which it was drawn. If we declare victory or failure in response to just one or a few bits of feedback, we risk “correcting” for an unlikely draw that dimly reflects the processes that really interest us. Instead, we should let the models run for a while before chucking or tweaking them, or at least leave the initial version running while trying out alternatives.
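A quick simulation makes the point. This toy sketch scores a forecaster whose true long-run hit rate is fixed by construction, first on a handful of events and then on many:

  set.seed(29)
  hit_rate <- 0.7   # the forecaster's true long-run accuracy, by construction

  replicate(5, mean(rbinom(10,  1, hit_rate)))   # accuracy judged on 10 events
  replicate(5, mean(rbinom(500, 1, hit_rate)))   # accuracy judged on 500 events
  # the 10-event estimates swing widely around 0.7, while the 500-event
  # estimates cluster tightly--a few hits or misses tell us very little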

Admittedly, this can be hard to do in practice, especially when the events of interest are rare. All of the applied forecasters I know—myself included—are tinkerers by nature, so it’s difficult for us to find the patience that second step requires. With U.S. elections, forecasters also know that they only get one shot every two or four years, and that most people won’t hear anything about their work beyond a topline summary that reads like a racing form from the horse track. If you’re at all competitive—and anyone doing this work probably is—it’s hard not to respond to that incentive. With the Early Warning Project, I worry about having a salient “miss” early in the system’s lifespan that encourages doubters to dismiss the work before we’ve really had a chance to assess its reliability and value. We can be patient, but if our intended audiences aren’t too, then the system could fail to get the traction it deserves.

Difficult doesn’t mean impossible, however, and I’m optimistic that political scientists will increasingly use forecasting in service of their search for more useful and more powerful theories. Journal articles that take this idea seriously are still rare birds, especially on things other than U.S. elections, but you occasionally spot them (Exhibit A and B). As Drew Linzer tweeted in response to Noel’s post, “Arguing over [predictive] models is arguing over assumptions, which is arguing over theories. This is exactly what [political science] should be doing.”

Machine learning our way to better early warning on mass atrocities

For the past couple of years, I’ve been helping build a system that uses statistics and expert crowds to assess and track risks of mass atrocities around the world. Recently dubbed the Early Warning Project (EWP), this effort already has a blog up and running (here), and the EWP should finally be able to launch a more extensive public website within the next several weeks.

One of the first things I did for the project, back in 2012, was to develop a set of statistical models that assess risks of onsets of state-led mass killing in countries worldwide, the type of mass atrocities for which we have the most theory and data. Consistent with the idea that the EWP will strive to keep improving on what it does as new data, methods, and ideas become available, that piece of the system has continued to evolve over the ensuing couple of years.

You can find the first two versions of that statistical tool here and here. The latest iteration—recently codified in new-and-improved replication materials—has performed pretty well, correctly identifying the few countries that have seen onsets of state-led mass killing in the past couple of years as relatively high-risk cases before those onsets occurred. It’s not nearly as precise as we’d like—I usually apply the phrase “relatively high-risk” to the Top 30, and we’ll only see one or two events in most years—but that level of imprecision is par for the course when forecasting rare and complex political crises like these.

Of course, a solid performance so far doesn’t mean that we can’t or shouldn’t try to do even better. Last week, I finally got around to applying a couple of widely used machine learning techniques to our data to see how those techniques might perform relative to the set of models we’re using now. Our statistical risk assessments come not from a single model but from a small collection of them—a “multi-model ensemble” in applied forecasting jargon—because these collections of models usually produce more accurate forecasts than any single one can. Our current ensemble mixes two logistic regression models, each representing a different line of thinking about the origins of mass killing, with one machine-learning algorithm—Random Forests—that gets applied to all of the variables used by those theory-specific models. In cross-validation, the Random Forests forecasts handily beat the two logistic regression models, but, as is often the case, the average of the forecasts from all three does even better.
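For the mechanically curious, the ensemble step itself looks roughly like this in R. The formulas and variable names below are placeholders standing in for the two theory-specific specifications, not the project’s actual models:

  library(randomForest)

  # 'train' and 'test' are country-year data frames; the formulas are placeholders
  # standing in for the two theory-specific specifications described above
  m_regime <- glm(onset ~ regime_type + discrimination,
                  data = train, family = binomial)
  m_crisis <- glm(onset ~ coup_attempt + armed_conflict + trade_shock,
                  data = train, family = binomial)
  m_rf     <- randomForest(factor(onset) ~ regime_type + discrimination +
                             coup_attempt + armed_conflict + trade_shock,
                           data = train, ntree = 1000)

  p_regime <- predict(m_regime, newdata = test, type = "response")
  p_crisis <- predict(m_crisis, newdata = test, type = "response")
  p_rf     <- predict(m_rf,     newdata = test, type = "prob")[, "1"]

  # the ensemble forecast is simply the unweighted average of the three
  p_ensemble <- (p_regime + p_crisis + p_rf) / 3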

Inspired by the success of Random Forests in our current risk assessments and by the power of machine learning in another project on which I’m working, I decided last week to apply two more machine learning methods to this task: support vector machines (SVM) and the k-nearest neighbors (KNN) algorithm. I won’t explain the two techniques in any detail here; you can find good explanations elsewhere on the internet (see here and here, for example), and, frankly, I don’t understand the methods deeply enough to explain them any better.

What I will happily report is that one of the two techniques, SVM, appears to perform our forecasting task about as well as Random Forests. In five-fold cross-validation, SVM and Random Forests both produced areas under the ROC curve (a.k.a. AUC scores) in the mid-0.80s. AUC scores range from 0.5 to 1, and a score in the mid-0.80s is pretty good for out-of-sample accuracy on this kind of forecasting problem. What’s more, when I averaged the estimates for each case from SVM and Random Forests, I got AUC scores in the mid- to upper 0.80s. That’s several points better than our current ensemble, which combines Random Forests with those logistic regression models.

By contrast, KNN did quite poorly, hovering close to the 0.5 mark that we would get with randomly generated probabilities. Still, success in one of the two experiments is pretty exciting. We don’t have a lot of forecasts to combine right now, so adding even a single high-quality model to the mix could produce real gains.

Mind you, this wasn’t a push-button operation. For one thing, I had to rework my code to handle missing data in a different way—not because SVM handles missing data differently from Random Forests, but because the functions I was using to implement the techniques do. (N.B. All of this work was done in R. I used ‘ksvm’ from the kernlab package for SVM and ‘knn3’ from the caret package for KNN.) I also got poor results from SVM in my initial implementation, which used the default settings for all of the relevant parameters. It took some iterating to discover that the Laplacian kernel significantly improved the algorithm’s performance, and that tinkering with the other flexible parameters (sigma and C for the Laplacian kernel in ksvm) had no effect or made things worse.
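For anyone who wants to replicate the general approach, the calls look roughly like this. The formula and data names are placeholders, both functions expect the outcome to be a two-level factor, and I’ve left sigma and C at their defaults here:

  library(kernlab)
  library(caret)

  # 'onset' must be a two-level factor; the feature set on the right-hand
  # side is whatever the theory-specific models already use
  fit_svm <- ksvm(onset ~ ., data = train, kernel = "laplacedot",
                  prob.model = TRUE)   # prob.model = TRUE to get probabilities
  p_svm   <- predict(fit_svm, newdata = test, type = "probabilities")[, 2]

  fit_knn <- knn3(onset ~ ., data = train, k = 5)
  p_knn   <- predict(fit_knn, newdata = test, type = "prob")[, 2]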

I also suspect that the performance of KNN would improve with more effort. To keep the comparison simple, I gave all three algorithms the same set of features and observations. As it happens, though, Random Forests and SVMs are less prone to over-fitting than KNN, which has a harder time separating the signal from the noise when irrelevant features are included. The feature set I chose probably includes some things that don’t add any predictive power, and their inclusion may be obscuring the patterns that do lie in those data. In the next go-round, I would start the KNN algorithm with the small set of features in whose predictive power I’m most confident, see if that works better, and try expanding from there. I would also experiment with different values of k, which I locked in at 5 for this exercise.

It’s tempting to spin the story of this exercise as a human vs. machine parable in which newfangled software and Big Data outdo models hand-crafted by scholars wedded to overly simple stories about the origins of mass atrocities. It’s tempting, but it would also be wrong on a couple of crucial points.

First, this is still small data. Machine learning refers to a class of analytic methods, not the amount of data involved. Here, I am working with the same country-year data set covering the world from the 1940s to the present that I have used in previous iterations of this exercise. This data set contains fewer than 10,000 observations on scores of variables and takes up about as much space on my hard drive as a Beethoven symphony. In the future, I’d like to experiment with newer and larger data sets at different levels of aggregation, but that’s not what I’m doing now, mostly because those newer and larger data sets still don’t cover enough time and space to be useful in the analysis of such rare events.

Second and more important, theory still pervades this process. Scholars’ beliefs about what causes and presages mass killing have guided my decisions about what variables to include in this analysis and, in many cases, how those variables were originally measured and the fact that data even exist on them at all. Those data-generating and variable-selection processes, and all of the expertise they encapsulate, are essential to these models’ forecasting power. In principle, machine learning could be applied to a much wider set of features, and perhaps we’ll try that some time, too. With events as rare as onsets of state-led mass killing, however, I would not have much confidence that results from a theoretically agnostic search would add real forecasting power and not just result in over-fitting.

In any case, based on these results, I will probably incorporate SVM into the next iteration of the Early Warning Project’s statistical risk assessments. Those are due out early in the spring of 2015, when all of the requisite inputs will have been updated (we hope). I think we’ll also need to think carefully about whether or not to keep those logistic regression models in the mix, and what else we might borrow from the world of machine learning. In the meantime, I’ve enjoyed getting to try out some new techniques on data I know well, where it’s a lot easier to tell if things are going wonky, and it’s encouraging to see that we can continue to get better at this hard task if we keep trying.

No, Pope Francis, this is not World War Three

In the homily at a mass celebrated this morning in Italy, at a monument to 100,000 soldiers killed in World War I, Pope Francis said:

War is madness… Even today, after the second failure of another world war, perhaps one can speak of a third war, one fought piecemeal, with crimes, massacres, destruction.

There are a lot of awful things happening around the world, and I appreciate the pope’s advocacy for peace, but this comparison goes too far. Take a look at this chart of battle deaths from armed conflict around the world from 1900 to 2005, from a study by the Peace Research Institute Oslo:

The chart doesn’t include the past decade, but we don’t need all the numbers in one place to see what a stretch this comparison is. Take Syria’s civil war, which has probably killed more than 150,000 (source) and perhaps as many as 300,000 or more people over the past three years, for an annual death rate of 50,000–100,000. That is a horrifying toll, but it is vastly lower than the annual rates in the several millions that occurred during the World Wars. Put another way, World War II was like 40 to 80 Syrian civil wars at once.

The many other wars of the present do not substantially close this gap. The civil war in Ukraine has killed approximately 3,000 so far (source). More than 2,000 people have died in the fighting associated with Israel’s Operation Protective Edge in Gaza this year (source). The resurgent civil war in Iraq dwarfs them both but still remains well below the intensity of the (interconnected) war next door (source). There are more than 20 other armed conflicts ongoing around the world, but most of them are much less lethal than the ones in Syria and Iraq, and their cumulative toll does not even begin to approach that of the World Wars (source).

I sympathize with the Pope’s intentions, but I don’t think that hyperbole is the best way to realize them. Of course, Pope Francis is not alone; we’ve been hearing a lot of this lately. I wonder if violence on the scale of the World Wars now lies so far outside of our lived experience that we simply cannot fathom it. Beyond some level of disorder, things simply become terrible, and all terrible things are alike. I also worry that the fear this apparent availability cascade is producing will drive other governments to react in ways that only make things worse.

The era of democratization is not over

In the latest issue of the Journal of Democracy (PDF), Marc Plattner makes the provocative claim that “the era of democratic transitions is over, and should now become the province of the historians.” By that, he seems to mean that we should not expect new waves of democratization similar in form and scale to the ones that have occurred before. I think Plattner is wrong, in part because he has defined “wave” too broadly. If we tighten up that concept a bit, I think we can see at least a few possibilities for new waves in the not-too-distant future, and thus an extension of the now long-running era of democratization.

In his essay, Plattner implicitly adopts the definition of waves of democratization described by Samuel Huntington on p. 15 of his influential 1991 book:

A wave of democratization is a group of transitions from nondemocratic to democratic regimes that occur within a specified period of time and that significantly outnumber transitions in the opposite direction during that period of time.

Much of what’s been written and said about waves of democratization since that book was published accepts those terms and the three waves Huntington identifies when he applies them to the historical evidence: one in Europe from the 1820s to the 1920s; another and wider one in Europe, Latin America, and Asia from the 1940s to the early 1960s; and a third and so-far final one that began in Portugal in 1974, has been global in scope, and now appears to have stalled or ended.

I find Huntington’s definition and resulting periodization wanting because they focus on the what and don’t pay enough attention to the why. A large number of transitions might occur around the same time because they share common underlying causes; because they cause and reinforce each other; or as a matter of chance, when independent events just happen to cluster. The third possibility is not scientifically interesting (cf. the Texas sharpshooter fallacy). More relevant here, though, I think the first two become banal if we let the time lag or chain of causality stretch too far. We inhabit a global system; at some level, everything causes, and is caused by, everything else. For the wave idea to be scientifically useful, we have to restrict its use to clusters of transitions that share common, temporally proximate causes and/or directly cause and reinforce each other.

By that definition, I think we can make out at least five and maybe more such waves since the early 1900s, not the three or maybe four we usually hear about.

First, as Plattner (p. 9) points out, what Huntington describes as the “first, long” wave really includes two distinct clusters: 1) the “dozen or so European and European-settler countries that already had succeeded in establishing a fair degree of freedom and rule of law, and then moved into the democratic column by gradually extending the suffrage”; and 2) “countries that became democratic after World War I, many of them new nations born from the midst of the European empires defeated and destroyed during the war.”

The second (or now third?) wave grew out of World War II. Even though this wave was relatively short, it also included a few distinct sub-clusters: countries defeated in that war, countries born of decolonization, and a number of Latin American cases. This wave is more coherent, in that all of these sub-clusters were at least partially nudged along by the war’s dynamics and outcomes. It wouldn’t be unreasonable to split the so-called second wave into two clusters (war losers and newly independent states) and a clump of coincidences (Latin America), but there are enough direct linkages across those sets to see meaning in a larger wave, too.

As for the so-called third wave, I’m with Mike McFaul (here) and others who see at least two separate clusters in there. The wave of democratization that swept southern Europe and Latin America in the 1970s and early 1980s is temporally and causally distinct from the spate of transitions associated with the USSR’s reform and disintegration, so it makes no sense to talk of a coherent era spanning the past 40 years. Less clear is where to put the many democratic transitions—some successful, many others aborted or short lived—that occurred in Africa as Communist rule collapsed. Based partly on Robert Bates’ analysis (here), I am comfortable grouping them with the post-Communist cases. Trends in the global economy and the disappearance of the USSR as a patron state directly affected many of these countries, and political and social linkages within and across these regional sets also helped to make democratization contagious once it started.

So, based on that definition and its application, I think it’s fair to say that we have seen at least five waves of democratization in the past two centuries, and perhaps as many as six or seven.

Given that definition, I think it’s also easier to see possibilities for new waves, or “clusters” if we want to make clearer the distinction from conventional usage. Of course, the probability of any new waves is partially diminished by the success of the earlier ones. Nearly two-thirds of the world’s countries now have regimes that most observers would call democratic, so the pool of potential democratizers is substantially diminished. As Plattner puts it (p. 14), “The ‘low-hanging fruit’ has been picked.” Still, if we look for groups of authoritarian regimes that share enough political, economic, social, and cultural connections to allow common causes and contagion to kick in, then I think we can find some sets in which this dynamic could clearly happen again. I see three in particular.

The first and most obvious is in the Middle East and North Africa, the region that has proved most resistant to democratization to date. In fact, I think we already saw—or, arguably, are still seeing—the next wave of democratization in the form of the Arab Spring and its aftermath. So far, that cluster of popular uprisings and state collapses has only produced one persistently democratic state (Tunisia), but it has also produced a democratic interlude in Egypt; a series of competitively elected (albeit ineffective) governments in Libya; a nonviolent transfer of power between elected governments in Iraq; ongoing (albeit not particularly liberal) revolutions in Syria and Yemen; and sustained, liberal challenges to authoritarian rule in Bahrain, Kuwait, and, perhaps, Saudi Arabia. In other words, a lot of countries are involved, and it ain’t over yet. Most of the Soviet successor states never really made it all the way to democracy, but we still think of them as an important cluster of attempts at democratization. I think the Arab Spring fits the same mold.

Beyond that, though, I also see the possibility of a wave of regime breakdowns and attempts at democracy in Asia brought on by economic or political instability in China. Many of the autocracies that remain in that region—and there are many—depend directly or indirectly on Chinese patronage and trade, so any significant disruption in China’s political economy would send shock waves through their systems as well. I happen to think that systemic instability will probably hit China in the next few years (see here, here, and here), but the timing is less relevant here than the possibility of this turbulence, and thus of the wider wave of democratization it could help to produce.

Last and probably least in its scope and impact, I think we can also imagine a similar cluster occurring in Eurasia in response to instability in Russia. The number of countries enmeshed in this network is smaller, but the average strength of their ties is probably similar.

I won’t hazard guesses now about the timing and outcome of the latter two possibilities beyond what I’ve already written about China’s increasing fragility. As the Arab Spring has shown, even when we can spot the stresses, it’s very hard to anticipate when they’ll overwhelm the sources of negative feedback and what form the new equilibrium will take. What I hope I have already done, though, is to demonstrate that, contra Plattner, there’s plenty of room left in the system for fresh waves of democratization. In fact, I think we even have a pretty good sense of where and how those waves are most likely to come.
