Observing and Understanding Social Unrest in Real Time

Game theoretic models of social unrest often represent governments and oppositions as unitary actors engaged in a sequence of moves involving binary choices. At any given time, an opposition can keep playing by the rules or choose to protest. If the opposition chooses to protest, the government can respond by conceding to protesters’ demands or repressing them. If the government represses, protesters can respond by dissipating or escalating. Ditto for the government on its next turn, and so on until either one side wins decisively or a bargain is struck that lets everyone get back to “normal” politics.

That class of models can produce, and has produced, important insights into the absence, occurrence, and dynamics of social unrest. At the same time, those models deliberately bracket some of the most interesting and arguably important aspects of social unrest—that is, the politics occurring within those camps. “Government” and “opposition” are shorthand for large assemblages of diverse individuals, each making his or her own choices under different circumstances and with different information. The interactions summarized in those formal models depend on—are constituted by—the actions and interactions occurring at this lower, or “micro,” level.

That micro level is harder to understand, but it’s what we actually see when we observe these eventful periods up close in real time. The ongoing occupation of parts of central Hong Kong—which, yes, is still happening, even if it has mostly fallen out of the international news stream—offers a case in point. As Chris Buckley and Alan Wong describe in today’s New York Times, protesters in Hong Kong right now are openly and self-consciously struggling to make one of those strategic choices. Here’s how Buckley and Wong describe the efforts to escalate:

Most mornings for weeks, in one of the pro-democracy protest camps here, Wong Yeung-tat has berated, mocked and goaded the government and, increasingly, the student protest leaders and democratic politicians he deems too timid.

“The occupy campaign needs to be taken to a new level,” he said in an interview. “There needs to be escalation, occupation of more areas or maybe government buildings. The campaign at this stage has become too stable”…

Mr. Wong’s organization, Civic Passion, and a tangle of like-minded groups, Internet collectives and free-floating agitators have grown impatient with the milder path supported by most protesters. They argue that only stronger action, such as new occupations, can force concessions from the Hong Kong government and the Chinese Communist Party.

Meanwhile,

Mainstream protesters fear confrontational tactics could tear the movement apart and anger ordinary residents, many already tiring of the protest camps.

“It will be difficult to narrow the differences,” said Lee Cheuk-yan, the chairman of the pro-democracy Labor Party, who has been castigated by the movement’s more zealous wing. “We have already escalated to a high point. If it would further alienate public opinion, then that’s something we don’t want to see.”

Through Buckley and Wong’s eyes, we see the participants standing at the figurative fork in the road—or, if you like, the node in the decision tree. And, as protesters argue and experiment their way toward a phase shift of one form or another, the government does the same. We usually don’t get to witness much of the government’s internal debating, but their tactical experiments are easy to spot, and Hong Kong is no exception on that front, either.

We still aren’t very good at understanding exactly how those decisions get made or predicting how the larger process will unfold. We are, however, pretty good at recognizing some of the patterns that make up these episodes (which are themselves figments of our theoretical imaginations, but still). In fact, the dynamic unfolding in Hong Kong right now is very much like what Sidney Tarrow described in Power in Movement (p. 24):

The power to trigger sequences of collective action is not the same as the power to control or sustain them. This dilemma has both an internal and an external dimension. Internally, a good part of the power of movements comes from the fact that they activate people over whom they have no control. This power is a virtue because it allows movements to mount collective actions without possessing the resources that would be necessary to internalize a support base. But the autonomy of their supporters also disperses the movement’s power, encourages factionalism and leaves it open to defection, competition and repression.

The similarity between that description and the evolution of the unrest in Hong Kong implies that we can sketch the causal terrain with some confidence, even if we can’t reliably predict exactly how social forces will flow through it each time.

Naturally, though, we still wonder: how will it turn out? Historical base rates imply that the factions advocating more aggressive tactics probably won’t tip the larger crowd toward escalation, and even if they do, that crowd will probably fail to achieve its objectives, at least in the short term. If I had to make a prediction, I would bet that this particular episode of unrest will conclude without having achieved any of its major demands. Still, base rates aren’t destiny, and if we already knew how this was going to turn out, it probably wouldn’t be happening in the first place.

Wisdom of Crowds FTW

I’m a cyclist who rides indoors a fair amount, especially in cold or wet weather. A couple of months ago, I bought an indoor cycle with a flywheel and a power meter. For the past several years, I’d been using the kind of trainer you attach to the back wheel of your bike for basement rides. Now, though, my younger son races, so I wanted something we could both use without too much fuss, and his coach wants to see power data from his home workouts.

To train properly with a power meter, I need to benchmark my current fitness. The conventional benchmark is Functional Threshold Power (FTP), which you can estimate from your average power output over a 20-minute test. To get the best estimate, you need to go as hard as you can for the full 20 minutes. To do that, you need to pace yourself. Go out too hard and you’ll blow up partway through. Go out too easy and you’ll probably end up lowballing yourself.

Once you have an estimate of your FTP, that pacing is easy to do: just ride at the wattage you expect to average. But what do you do when you’re taking the test for the first time?

I decided to solve that problem by appealing to the wisdom of the crowd. When I ride outdoors, I often ride with the same group, and many of those guys train with power meters. That means they know me and they know power data. Basically, I had my own little panel of experts.

Early this week, I emailed that group, told them how much I weigh (about 155 lbs), and asked them to send me estimates of the wattage they thought I could hold for 20 minutes. Weight matters because power covaries with it. What the other guys observe is my speed, which is a function of power relative to weight. So, to estimate power based on observed speed, they need to know my weight, too.

I got five responses that ranged from 300 to 350. Based on findings from the Good Judgment Project, I decided to use the median of those five guesses—314—as my best estimate.
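
For the record, that aggregation is trivial to reproduce in R. The five guesses below are invented stand-ins—the post only reports the 300-to-350 range and the 314 median—so treat this as a sketch of the procedure rather than the actual data:

# Hypothetical stand-ins for the five emailed guesses, in watts; only the
# range and the median match the numbers reported above.
guesses <- c(300, 305, 314, 330, 350)

# Take the median as the point estimate: it resists being dragged around by
# a single very high or very low guess.
median(guesses)  # 314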

I did the test on Tuesday. After 15 minutes of easy spinning, I did 3 x 30 sec at about 300W with 30 sec easy in between, then another 2 min easy, then 3 min steady above 300W, then 7 min easy, and then I hit it. Following emailed advice from Dave Guttenplan, who sometimes rides with our group, I started out a little below my target, then ramped up my effort after about 5 min. At the halfway point, I peeked at my interval data and saw that I was averaging 310W. With 5 min to go, I tried to up the pace a bit more. With 1 min to go, I tried to dial up again and found I couldn’t go much harder. No finish-line sprint for me. When the 20-minute mark finally arrived, I hit the “interval” button, dialed the resistance down, and spent the next minute or so trying not to barf—a good sign that I’d given it just about all I had.

And guess what the final average was: 314!

Now, you might be thinking I tried to hit that number because it makes for a good story. Of course I was using the number as a guideline, but I’m as competitive as the next guy, so I was actually pretty motivated to outperform the group’s expectations. Over the last few minutes of the test, I was getting a bit cross-eyed, too, and I don’t remember checking the output very often.

This result is also partly coincidence. Even the best power meters have a margin of error of about 2 percent, and that’s assuming they’re properly calibrated. Two percent of 314 is a bit more than six watts, so the best I can say is that my average output from that test was probably around 314W, give or take several watts.

Still, as an applied stats guy who regularly works with “wisdom of crowds” systems, I thought this was a great illustration of those methods’ utility. In this case, the remarkable accuracy of the crowd-based estimate surely had a lot to do with the crowd’s expertise. I only got five guesses, but they came from people who know a lot about me as a rider and whose experience training with power and looking at other riders’ numbers has given them a strong feel for the distribution of these stats. If I’d asked a much bigger crowd who didn’t know me or the data, I suspect the estimate would have missed badly (like this one). Instead, I got just what I needed.

Reactions to Reflections on the Arab Uprisings

Yesterday, Marc Lynch posted a thoughtful and candid set of reflections on how political scientists who specialize in the Middle East performed as analysts and forecasters during the Arab uprisings rather than before them, the period on which most of the retrospectives have focused thus far. The background to the post is a set of memos Marc commissioned from the contributors to a volume he edited on the origins of the uprisings. As Marc summarizes, their self-criticism is tough:

We paid too much attention to the activists and not enough to the authoritarians; we understated the importance of identity politics; we assumed too quickly that successful popular uprisings would lead to a democratic transition; we under-estimated the key role of international and regional factors in domestic outcomes; we took for granted a second wave of uprisings, which thus far has yet to materialize; we understated the risk of state failure and over-stated the possibility of democratic consensus.

Social scientists and other professional analysts of world affairs should read the whole thing—if not for the specifics, then as an example of how to assess and try to learn from your own mistakes. Here, I’d like to focus on three points that jumped out at me as I read it.

The first is the power of motivated reasoning—”the unconscious tendency of individuals to process information in a manner that suits some end or goal extrinsic to the formation of accurate beliefs.” When we try to forecast politics in real time, we tend to conflate our feelings about specific events or trends with their likelihood. After noting that he and his colleagues over-predicted democratization, Marc observes:

One point that emerged in the workshop discussions is the extent to which we became too emotionally attached to particular actors or policies. Caught up in the rush of events, and often deeply identifying with our networks of friends and colleagues involved in these politics, we may have allowed hope or passion to cloud our better comparative judgment.

That pattern sounds a lot like the one I saw in my own thinking when I realized that my initial forecasts about the duration and outcome of the Syrian civil war had missed badly.

This tendency is probably ubiquitous, but it’s also one about which we can actually do something, even if we can’t eliminate it. Whenever we’re formulating an analysis or prediction, we can start by asking ourselves what result we hope to see and why, and we can think about how that desire might relate to the conclusions we’re reaching. We can try to imagine how someone with different motivations might view the same situation, or just seek out examples of those alternative views. Finally, we can weight or adjust our own analysis accordingly. Basically, we can try to replicate in our own analysis what “wisdom of crowds” systems do to great effect on a larger scale. This exercise can’t fully escape the cognitive traps to which it responds, but I think it can at least mitigate their influence.

Second, Marc’s reflections also underscore our tendency to underestimate the prevalence of inertia in politics, especially during what seem like exceptional times. As I recently wrote, our analytical eyes are drawn to the spectacular and dynamic, but on short time scales at least, continuity is the norm. Observers hoping for change in the countries touched by the Arab uprisings would have done well to remember this fact—and surely some did—when they were trying to assess how much structural change those uprisings would actually produce.

My last point concerns the power of social scientists to shape these processes as they unfold. In reflecting on his own analysis, Marc notes that he correctly saw how the absence of agreement on the basic rules of politics would complicate transitions, but he “was less successful in figuring out how to overcome these problems.” Marc aptly dubs this uncertainty Calvinball, and he concludes:

I’m more convinced than ever that moving beyond Calvinball is essential for any successful transition, but what makes a transitional constitutional design process work—or fail—needs a lot more attention.

Actually, I don’t think the problem is a lack of attention. How to escape this uncertainty in a liberal direction has been a central concern for decades now of scholarship on democratization and of the field of applied democracy promotion that’s grown up alongside it. Giuseppe di Palma’s 1990 book, To Craft Democracies, remains a leading example of the kind of advocacy-cum-scholarship this field has produced, but there are countless “lessons learned” white papers and “best practices” policy briefs to go with it.

No, the real problem is that transitional periods are irreducibly fraught with the uncertainties Marc rightly spotlighted, and there simply are no deus-ex-machina resolutions to them. When scholars and practitioners do get involved, we are absorbed into the politics we mean to “correct,” and most of us aren’t nearly as adept in that field as we are in our own. After a couple of decades of closely watching these transitions and the efforts of various parties to point them in particular directions, I have come to believe that this is one of those things social science can help us understand but not “fix.”

On the Consumption of Protest Art in Real Time

Today’s New York Times carries a story describing efforts by “preservationists, historians and art lovers” to capture and share art produced by the ongoing occupations in Hong Kong:

Because most of the art is still on the streets, the archiving is largely digital. Some digital renditions and objects are already running alongside the “Disobedient Objects” exhibition at the Victoria and Albert Museum in London.

The Umbrella Movement Visual Archives and Research Collective, led partly by academics, is creating open-data platforms and Google maps to mark the locations of art pieces.

A new group—Umbrella Movement Art Preservation, or UMAP—has “rescue team members” on the ground, armed with cellphones and ready to mobilize volunteers to evacuate art on short notice. They have received offers of help from sympathetic truck drivers and about a dozen private galleries…

“It is all installation art,” said Mr. Wong of UMAP.

This process strikes me as unavoidably exploitative. The objects of this preservation campaign are art, but it is art that is meant to serve a specific and immediate political purpose. Removed from its original context and displayed online or in galleries, protest art becomes a form of found-object art. The “discovery” and display of these objects produces aesthetic and, in some cases, commercial value for its conveyors and consumers, but those returns are not shared with the original producers. Preservers, gallerists, and viewers inevitably engage in appropriation as well as appreciation.

More important, these preservation efforts give onlookers a way to enjoy the art without getting enmeshed in the politics. They treat the demonstrations as a creative performance, a kind of entertainment—”It is all installation art”—for the benefit of the viewer. In so doing, they implicitly ignore the strong political claims that this “performance” and the objects it generates are meant to produce.

The location of the original production is an essential part of its political meaning. The fact that it is confrontational and therefore dangerous to produce and display that art in those places is precisely what imbues it with any political power. By removing the art from that location, preservationists give distant onlookers a chance to enjoy the show without directly engaging in those politics. Politics is suffused with symbolic expression, but in situations like this one, the symbols are meant to serve a political purpose. When you try to separate the former from the latter, you implicitly ignore—and thus, in a fashion, reject—that purpose.

This rejection becomes less problematic, or at least less consequential, with the passage of time. When done in the moment, though, the decision to consume the aesthetic without engaging in the politics can have political consequences. “Wait, let me just move this sculpture out of the way before you smash everything to bits…” could imply that you care more about the sculpture than the people who produced it. More likely, it implies that you feel powerless to help defend those producers. I imagine that neither of those messages is particularly encouraging to the protesters or discouraging to those who would do the smashing.

I arguably engage in a related form of exploitation in my own work. My trade is explaining and forecasting political calamities that often involve substantial human suffering. To make my work more credible, I avoid public advocacy or activism on the topics and cases I study. So, I am finding and exploiting commercial value in the actions and suffering of others while adopting a public posture of indifference to that suffering. I’m not sure what to do with that fact right now, but I thought it only fair to acknowledge it in a post that scolds others for the same.

The Ghosts of Wu Chunming’s Past, Present, and Future

On a blogged recommendation from Chris Blattman, I’m now reading Factory Girls. Written by Leslie T. Chang and published in 2008, it’s a non-fiction book about the young migrant women whose labor has stoked the furnaces of China’s economic growth over the past 30 years.

One of the book’s implicit “findings” is that this migration, and the larger socioeconomic transformation of which it is a part, is a difficult but ultimately rewarding process for many. Chang writes (p. 13, emphasis in the original):

Migration is emptying villages of young people. Across the Chinese countryside, those plowing and harvesting in the fields are elderly men and women, charged with running the farm and caring for the younger children who are still in school. Money sent home by migrants is already the biggest source of wealth accumulation in rural China. Yet earning money isn’t the only reason people migrate. In surveys, migrants rank ‘seeing the world,’ ‘developing myself,’ and ‘learning new skills’ as important as increasing their incomes. In many cases, it is not crippling poverty that drives migrants out from home, but idleness. Plots of land are small and easily farmed by parents; nearby towns offer few job opportunities. There was nothing to do at home, so I went out.

That idea fits my priors, and I think there is plenty of system-level evidence to support it. Economic development carries many individual and collective costs, but the available alternatives are generally worse.

Still, as I read, I can’t help but wonder how much the impressions I take away from the book are shaped by selection bias. Like most non-fiction books written for a wide audience, Factory Girls blends reporting on specific cases—here, the experiences of certain women who have made the jump from small towns to big cities in search of paid work—with macro-level data on the systemic trends in which those cases are situated. The cases are carefully and artfully reported, and it’s clear that Chang worked on and cared deeply about this project for many years.

No matter how hard the author tried, though, there’s a hitch in her research design that’s virtually impossible to overcome. Chang can only tell the stories of migrants who chose to share them with her, and these sources are not a random sample of all migrants. Even worse for attempts to generalize from those sources, there may be a correlation between the ability and desire to tell your story to a foreign reporter and the traits that make some migrants more successful than others. We don’t hear from young women who are too ashamed or humble or uninterested to tell their stories to a stranger who wants to share them with the world. We certainly can’t hear from women who have died or been successfully hidden from the reporter’s view for one reason or another. If the few sources who open up to Chang aren’t representative of the pool of young women whose lives she aims to portray, then their stories won’t be, either.

An anecdote from Wu Chunming, one of the two young women on whom the book focuses, stuck in my mind as a metaphor for the selection process that might skew our view of the process Chang means to describe. On pp. 46-47, Chang writes:

Guangdong in 1993 was even more chaotic than it is today. Migrants from the countryside flooded the streets looking for work, sleeping in bus stations and under bridges. The only way to find a job was to knock on factory doors, and Chunming and her friends were turned away from many doors before they were hired at the Guotong toy factory. Ordinary workers there made one hundred yuan a month, or about twelve dollars; to stave off hunger, they bought giant bags of instant noodles and added salt and boiling water. ‘We thought if we ever made two hundred yuan a month,’ Chunming said later, ‘we would be perfectly happy.’

After four months, Chunming jumped to another factory, but left soon after a fellow worker said her cousin knew of better jobs in Shenzhen. Chunming and a few friends traveled there, spent the night under a highway overpass, and met the girl’s cousin the next morning. He brought them to a hair salon and took them upstairs, where a heavily made-up young woman sat on a massage bed waiting for customers. Chunming was terrified at the sight. ‘I was raised very traditionally,’ she said. ‘I thought everyone in that place was bad and wanted me to be a prostitute. I thought that once I went in there, I would turn bad too.’

The girls were told that they should stay and take showers in a communal stall, but Chunming refused. She walked back down the stairs, looked out the front door, and ran, abandoning her friends and the suitcase that contained her money, a government-issued identity card, and a photograph of her mother…

‘Did you ever find out what happened to the friends you left behind in the hair salon?’ I asked.

‘No,’ she said. ‘I don’t know if it was a truly bad place or just a place where you could work as a massage girl if you wanted. But it was frightening that they would not let us leave.’

In that example, we hear Wu’s side of this story and the success that followed. What we don’t hear are the stories of the other young women who didn’t run away that day. Maybe the courage or just impulsiveness Chunming showed in that moment is something that helped her become more successful afterwards, and that also made her more likely to encounter and open up to a reporter.

Chang implicitly flags this issue for us at the end of that excerpt, and she explicitly addresses it in a “conversation” with the author that follows the text in my paperback edition. Still, Chang can’t tell us the versions of the story that she doesn’t hear. In social-scientific jargon, those other young women left behind at the hair salon are the unobserved counterfactuals to the optimistic narrative we get from Chunming. A more literary soul might describe those other girls as the ghosts of Wu Chunming’s past, present, and future. Unlike Dickens’ phantoms, though, these other lives actually happened, and yet we still can’t see them.

In a recent blog post, sociologist Zeynep Tufekci wrote about the relationship between a project’s research design and the inferences we can draw from it:

Research methods, a topic that is seemingly so dry, are the heart and soul of knowledge. Most data supports more than one theory. This does NOT mean all data supports all theories: rather, multiple explanations can fit one set of findings. Choosing the right underlying theory, an iterative process that always builds upon itself, requires thinking hard on how data selection impacts findings, and how presentation of findings lends itself to multiple theories, and how theories fit with existing worldviews, and how better research design can help us distinguish between competing explanations.

A good research project consciously grapples with these.

Like the video Tufekci critiques in her essay, Chang’s book is a research project. Factory Girls is a terrific piece of work and writing, but those of us who read it with an eye toward understanding the wider processes its stories are meant to represent should do so with caution, especially if it confirms our prior beliefs. I hope that economic development is mostly improving the lives of young women and men in China, and there is ample macro-level evidence that it is. The stories Chang relates seem to confirm that view, but a little thinking about selection effects suggests that we should expect them to do so whether or not the view is correct. To really test those beliefs, we would need to trace the life courses of a wider sample of young women. As often happens in social science, though, the cases most important to testing our mental models are also the hardest to see.

Positive Feedback Junkie

Yesterday, while grabbing a last half-cup of coffee after an event about political risk assessment, I met a guy who told me he used to work as a futures trader.

“What’s that like?” I asked him.

“Everyone’s different,” he said, and then described a few of the work routines and trading strategies he and his former colleagues had followed.

As he talked about the lifestyle, I recognized some of my own habits. Right now, I’m actively forecasting on at least five different platforms. Together, three of those—the Early Warning Project’s opinion pool, the Good Judgment Project, and Inkling’s public prediction market—cover an almost-absurd array of events and processes around the world, from political violence to trade agreements, election outcomes, and sporting contests. To try to do well on all of those platforms, I have to follow news from as many sources as I can about all kinds of places and organizations. I also forecast on this blog. Here, the prognostications are mostly annual, but they’re public, too, so the results directly affect my professional reputation. The events I forecast here are also rare, so the reputational consequences of a hit or miss will often linger for weeks or months. The fifth platform—the stock market—requires yet-another information set and involves my own real money.

One of the things the Good Judgment Project has found is that subject-matter expertise isn’t reliably associated with higher forecasting accuracy, but voraciously consuming the news and frequently updating your forecasts are. The term “information junkie” comes to mind, and I think the junkie part may be more relevant than we let on. When you’re trying to anticipate the news, there’s a physiological response, an amped-up feeling you get when events are moving quickly in a situation about which you’ve made a forecast. I recognize that cycle of lulls and rushes from a short flirtation with online, play-money poker more than a decade ago, and I sometimes get it now when a blog post gets a burst of attention. When things are slow and nothing relevant seems to be happening, there’s an edginess that persists and pulls you into searching for new information, new opportunities to forecast, new levers to push and then wait for the treat to drop. I’ve also noticed that this feeling gets amplified by Twitter. There, I can see fresh information roll by in real time, like a stock ticker for geopolitics if you follow the right mix of people. I can also chase little rushes by dropping my own tweets into the mix and then watching for retweets, responses, and favorites.

When I started college, I thought I would major in biology. I had really enjoyed math and science in high school, had done well in them, and imagined making a career out of those interests and what seemed like talents. First semester of freshman year, I took vector calculus and chemistry. I also behaved like a lot of college freshmen, not working as hard as I had in high school and doing some other things that weren’t especially good for my cognitive skill and accumulation of knowledge. As the semester rolled by, I found that I wasn’t doing as well as I’d expected in those math and science classes, but I was doing very well in my social-science and Russian-language courses. After freshman year, I didn’t take another math or natural-science class in college, and I graduated three years later with a degree in comparative area studies.

Sometimes I regret my failure to chase that initial idea a little harder. When that happens, I explain that failure to myself as the result of a natural impulse to seek out and stay close to streams of positive feedback. I see the same impulse in my forecasting work, and I see it in my own and other people’s behavior on social media, too. It’s not freedom from stress we’re seeking. The absence of stress is boredom, and I don’t know anyone who can sit comfortably with that feeling for long. What I see instead is addictive behavior, the relentless chase for another hit. We’re okay with a little discomfort, as long as the possibility of the next rush hides behind it, and the rush doesn’t have to involve money to feel rewarding.

After the guy I met yesterday had described some traders’ work routines—most of which would probably sound great to people in lots of other jobs, and certainly to people without jobs—I asked him: “So why’d you leave it?”

“Got tired of always chasin’ the money,” he said.

The Inescapable Uncertainty of Popular Uprisings

On Tuesday, hundreds of thousands of people turned out in the streets of Ouagadougou to protest a plan to remove term limits ahead of next year’s presidential election in Burkina Faso. Blaise Compaore has held that country’s top office for 27 years by way of a 1987 coup and four subsequent elections that have not been fair, and his party dominates the legislature for the same reason. Tuesday’s protests are part of a wider and ongoing wave of actions that includes a general strike and stay-aways from schools and universities. A similar wave of protests occurred over several months in 2011. The state’s efforts to repress those challenges killed several people on at least two occasions, and virtually nothing changed in their wake.

Protesters in Ouagadougou on 28 October 2014 (Photo credit: Issouf Sanogo/AFP)

So, will the latest protests in Burkina Faso coalesce into a sustained campaign, or will they soon peter out? If they do coalesce, will that campaign spur significant reform or even revolution, or will it dissipate against repression, redirection, and resistance from incumbent power-holders?

The truth is, no one really knows, and this uncertainty is not specific to Burkina Faso. After decades of thoughtful research, social scientists still can’t reliably predict which bouts of unrest will blow up into revolutions and which won’t.

We can say some useful things about which structural conditions are more conducive, and thus which cases are more susceptible, to sustained popular challenges. A study I co-piloted with Erica Chenoweth (details forthcoming) found several features that can help assess where nonviolent campaigns are more likely to emerge, but the forecasting power of models based on those features is not stellar. Efforts to develop predictive models of civil-war onset have achieved similar results.

Once unrest starts to burble, though, we still don’t understand and can’t model the ensuing process well enough to reliably predict which way it will tip. Across many cases, a simple base-rate forecast will produce very accurate results. Keep betting on the persistence of the status quo, and you’ll almost always be right. If you’re trying to predict what will happen in a specific case at a specific juncture, however, it’s still hard to improve much on that crude baseline.
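
To put a rough number on that, here is a toy calculation in R; the 1-percent base rate is an illustrative figure, not an estimate for any particular class of events:

# If the event of interest occurs in about 1 of every 100 case-years, a
# forecaster who always bets on the status quo is right about 99 percent of
# the time -- while saying nothing useful about which cases will actually tip.
base_rate <- 0.01
status_quo_accuracy <- 1 - base_rate
status_quo_accuracy  # 0.99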

This persistent uncertainty can be maddening. Lots of smart people have spent a lot of time studying and thinking about these processes, and it feels like all that effort should have yielded bigger gains in predictive power by now.

That failure is also enlightening. If we believe that our efforts to date have been thoughtful and thorough, then the lack of progress on predicting the dynamics of these situations is telling us something important about the nature of the underlying process. Uncertainty isn’t just a consequence of these political behaviors; it’s a prerequisite for them. Phil Arena made a version of this point on Twitter, framing it in terms of uncertainty about violence.

And it’s not just uncertainty about the potential for harsh state repression, which is what I took Phil to mean by “violence.” Uncertainty about who else will turn out under what conditions, what forms that violence will take and exactly whom it will directly affect, how challengers will organize and adapt in response to those events, what changes in policy or institutions those actions will produce, and who will benefit or suffer how much from those changes are all relevant, too.

In short, the rare political “events” we wish to predict are shorthand for myriad interactions over time among large numbers of heterogeneous individuals who plan and learn and screw up in a changing environment in which information is inevitably incomplete and imperfect. The results are not random, but they are complex, in both the conventional and scientific sense of that term. If we could reliably foresee how things were going to go, then we would adapt our behavior accordingly, and the whole thing would unravel before it even started.

Under these conditions, central tendencies can and do still emerge. A small but growing body of work in political science shows that we can use structural patterns and observations of leading-edge activities to smudge base-rate forecasts a bit in either direction and achieve marginal gains in accuracy. Systems that properly elicit and combine forecasts from thoughtful crowds also turn out to have real predictive power, especially on short time horizons.

Still, the future trajectories of individual cases of incipient revolution will remain hard to foresee with accuracy much beyond the banal prediction that tomorrow will most likely resemble today. That persistent fuzziness is not always what politicians, activists, investors, and other interested or just curious observers want to hear, but on this class of events, it’s probably as clairvoyant as we’re going to get.

The Political Power of Inertia

Political scientists devote a lot of energy to theorizing about dramatic changes—things like revolutions, coups, popular uprisings, transitions to democracy, and the outbreak of wars within and between states. These changes are fascinating and consequential, but they are also extremely rare. In politics, as in physics, inertia is a powerful force. Our imagination is drawn to change, but if we want to understand the world as it is, then we have to explain the prevalence of continuity as well.

Examples of inertia in politics are easy to find. War is justifiably a central concern for political science, but for many decades now, almost none of the thousands of potential wars within and between states have actually happened. Once a war does start, though, it often persists for years in spite of the tremendous costs involved. The international financial system suffers frequent and sometimes severe shocks and has no sovereign to defend it, and yet the basic structure of that system has persisted for decades. Whole journals are devoted to popular uprisings and other social movements, but those events very rarely happen, and when they do, they often fail to produce lasting institutional change. For an array of important phenomena in the social sciences, by far the best predictor of the status of the system at time (t + 1) is the status of the system at time (t).

One field in which inertia gets its due is organization theory. A central theme in that neck of the intellectual woods is the failure of firms and agencies to adapt to changes in their environment and the search for patterns that might explain those failures. Some theories of institutional design at the level of whole political systems also emphasize stasis over change. Institutions are sometimes said to be “sticky,” meaning that they often persist in spite of evident flaws and available alternatives. As Paul Pierson observes, “Once established, patterns of political mobilization, the institutional ‘rules of the game,’ and even citizens’ basic ways of thinking about the political world will often generate self-reinforcing dynamics.”

In international relations and comparative politics, we see lots of situations in which actions that might improve the lot of one or more parties are not taken. These are situations in which inertia is evident, even though it appears to be counterproductive. We often explain failures to act in these situations as the result of collective action problems. As Mancur Olson famously observed, people, organizations, and other agents have diverse interests; action to try to produce change is costly; and the benefits of those costly actions are often diffuse. Under these circumstances, a tally of expected costs and benefits will often discourage agents from taking action, tempting them to forgo those costs and free ride on the contributions of others instead.

Collective action problems are real and influential. Still, I wonder if our theories put too much emphasis on those system-level sources of inertia and too little on causes at the level of the individual. We like to think of ourselves as free and unpredictable, but humans really are creatures of habit. For example, a study published in 2010 in Science (here) used data sampled from millions of mobile-phone users to show that there is “a potential 93% average predictability” in where users go and when, “an exceptionally high value rooted in the inherent regularity of human behavior.” The authors conclude that,

Despite our deep-rooted desire for change and spontaneity, our daily mobility is, in fact, characterized by a deep-rooted regularity.

A related study (here) used mobility and survey data from Kenya and found essentially the same thing. Its authors reported that “mobility estimates are surprisingly robust to the substantial biases in phone ownership across different geographical and socioeconomic groups.” Apparently, this regularity is not unique to rich countries.

The microfoundations of our devotion to routine may be evident in neurobiology. Behavioral routines are physically expressed and reinforced in the development of neural pathways related to specific memories and actions, and in the thickening of the myelin sheaths that facilitate conduction along those pathways. The result is a virtuous or vicious circle, depending on the behavior and context. Athletes and musicians take advantage of this process through practice, but practice is mostly repetition, and repetition is a form of routine. Repetition begets habituation begets repetition.

This innate attachment to routine may contribute to political inertia. Norms and institutions are often regarded as clever solutions to collective action problems that would otherwise thwart our interests and aspirations. At least in part, those norms and institutions may also be social manifestations of an inborn and profound preference for routine and regularity.

In our theoretical imaginations, we privilege change over stasis. As alternative futures, however, the two are functionally equivalent, and stasis is vastly more common than change. In principle, our theories should cover both alternatives. In practice, that is very hard to do, and many of us choose to emphasize the dramatic over the routine. I wonder if we have chosen wrong.

For now, I’ll give the last word on this topic to Frank Rich. He wrote a nice essay for the October 20, 2014, issue of New York Magazine about an exercise in which he read his way back through the daily news from 1964 to compare it to the supposedly momentous changes afoot in 2014. His conclusion:

Even as we recognize that the calendar makes for a crude and arbitrary marker, we like to think that history visibly marches on, on a schedule we can codify.

The more I dove back into the weeds of 1964, the more I realized that this is both wishful thinking and an optical illusion. I came away with a new appreciation of how selective our collective memory is, and of just how glacially history moves.

Two Tidbits on Social Unrest

1. We like to tell tidy stories about why social unrest happens, and those stories usually involve themes of grievance or social injustice—things like hardship, inequality, corruption, discrimination, and political repression. One or more of those forces probably plays a role in many bouts of unrest, especially the ones that emerge from or evolve into sustained action like we’re seeing right now in Hong Kong and Ferguson.

Still, a riot over the weekend at a pumpkin festival in semi-rural Keene, New Hampshire, reminds us that you don’t need those big issues or themes to get to social unrest. According to the L.A. Times, in Keene,

Young people chucked beer cans and cups at each other, jumped off roofs, tore down, kicked and smashed road signs, set a large fire and chanted profanity, celebrated on top of a flipped car, took selfies in front of lines of riot police, got the attention of a police helicopter, chanted “U-S-A!”, pushed barricades and threw a street sign at police, threw bottles at the police after the police threw tear gas, and left behind a huge mess.

Why? Who knows, but the main ingredients in this instance seem to have been youth, alcohol, numbers, and the pleasure of transgression.

The description of the scene in Keene reminded me of the riots that sometimes erupt in college towns and sports-mad cities after big games, some of which have proven extremely destructive. These riots differ qualitatively from the rallies, marches, sit-ins, and the like that social scientists generally study. For two things, they usually aren’t planned in advance, and the participants aren’t making political claims. Still, I think our understanding of those ostensibly more political forms of collective action suffers when we make our causal narratives too tidy and ignore the forces that also produce these other kinds of outbursts.

2. Contagion is one of those forces that seems to operate across many forms of unrest. We’re sure that’s true, but we still don’t understand very well how that process works. Observers often use dominoes as a metaphor for contagion, implying that a given unit must fall in order for the cascade to pass through it.

A new paper on arXiv proposes another mechanism that allows the impulse to “hop” some units—in other words, to pass through them without producing the same type of event or effect. Instead of dominoes, contagion might work more like a virus that some people can catch and transmit without ever becoming symptomatic themselves. The authors think this mechanism could help to explain the timing and sequencing of protests in the Arab Spring:

In models of protests and revolutions, populations can have two stable equilibria—the size of the protest is either large or negligibly small—because of strategic complementarities (protest becomes more attractive as more people protest). During the Arab Spring, each country had unique grievances and agendas, and we hypothesize that each country had a unique proximity to a tipping point beyond which people would protest. Once protests began in one country (Tunisia), inspiration to protest spread to other countries via traditional media (such as newspapers) and via social media (such as Twitter and Facebook). This cross-border communication spread strategies for successful uprisings, and it increased expectations for success. Consequently, the uprisings began within a short window of time, seemingly cascading among countries more quickly than earlier revolutions did.

In coarse-grained data on the number of Facebook friendships between countries, we find evidence of the “cascade hopping” phenomenon described above. In particular, Saudi Arabia and Egypt appear to play the role of an intermediate country Y that propagated influence to protest from protesting countries to non-protesting countries, thereby helping to trigger protest in the latter countries, without themselves protesting until much later. Attributes of these intermediate countries and of the countries that they may have influenced to protest suggest that protests first spread to countries close to their tipping points (high unemployment and economic inequality) and strongly coupled to other countries via social media (measured as high Internet penetration). By contrast, we find that traditional measures of susceptibility to protest, such as political freedoms and food price indices, could not predict the order in which protests began.

As with the structural and dynamic stuff discussed around this weekend’s riot in Keene, this hopping mechanism will never be the only force at work in any instance of social unrest. Even so, it’s a useful addition to the set of processes we ought to consider whenever we try to explain or predict where and when other instances might happen.
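
The “tipping point” and “two stable equilibria” language in that abstract follows the same logic as classic threshold models of collective action. The R sketch below is not the authors’ model, and the threshold distributions are invented; it only shows how a tiny difference in individual thresholds can separate a fizzle from a full cascade:

# Granovetter-style threshold cascade: each person joins once the share of
# others already protesting reaches his or her personal threshold.
cascade <- function(thresholds) {
  share <- 0
  repeat {
    new_share <- mean(thresholds <= share)  # everyone whose threshold is met joins
    if (new_share == share) return(share)   # no one else joins: a stable outcome
    share <- new_share
  }
}

n <- 100
thresholds_a <- (0:99) / n   # one person needs no company, one needs 1 percent, one needs 2 percent, ...
thresholds_b <- thresholds_a
thresholds_b[2] <- 2 / n     # raise a single person's threshold from 1 percent to 2 percent

cascade(thresholds_a)  # 1.00: the whole population ends up in the street
cascade(thresholds_b)  # 0.01: the same crowd, minus one willing early joiner, stalls

That knife-edge sensitivity is one reason these cascades are so hard to call in advance.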

Forecasting Round-up No. 8

1. The latest Chronicle of Higher Education includes a piece on forecasting international affairs (here) by Beth McMurtrie, who asserts that

Forecasting is undergoing a revolution, driven by digitized data, government money, new ways to analyze information, and discoveries about how to get the best forecasts out of people.

The article covers terrain that is familiar to anyone working in this field, but I think it gives a solid overview of the current landscape. (Disclosure: I’m quoted in the piece, and it describes several research projects for which I have done or now do paid work.)

2. Yesterday, I discovered a new R package that looks to be very useful for evaluating and comparing forecasts. It’s called ‘scoring’, and it does just that, providing functions to implement an array of proper scoring rules for probabilistic predictions of binary and categorical outcomes. The rules themselves are nicely discussed in a 2013 publication co-authored by the package’s creator, Ed Merkle, and Mark Steyvers. Those rules and a number of others are also discussed in a paper by Patrick Brandt, John Freeman, and Phil Schrodt that appeared in the International Journal of Forecasting last year (earlier ungated version here).

I found the package because I was trying to break the habit of always using the area under the ROC curve, or AUC score, to evaluate and compare the accuracy of forecasts from statistical models of rare events. AUC is quite useful as far as it goes, but it doesn’t address all aspects of forecast accuracy we might care about. Mathematically, the AUC score represents the probability that a prediction selected at random from the set of cases that had an event of interest (e.g., a coup attempt or civil-war onset) will be larger than a prediction selected at random from the set of cases that didn’t. In other words, AUC deals strictly in relative ranking and tells us nothing about calibration.
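
That rank-ordering interpretation is easy to check directly. Here is a small R sketch with made-up predictions that computes AUC as the share of event/non-event pairs the model orders correctly, counting ties as half:

set.seed(1)
p_event    <- c(0.45, 0.20, 0.65)  # made-up predictions for cases that had the event
p_nonevent <- runif(200, 0, 0.30)  # made-up predictions for cases that didn't

# Compare every event prediction with every non-event prediction.
correct_order <- outer(p_event, p_nonevent, ">")
ties          <- outer(p_event, p_nonevent, "==")
mean(correct_order + 0.5 * ties)   # the AUC

Note that you could shrink all of those predictions toward zero without changing the result, which is exactly the calibration blind spot described above.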

This came up in my work this week when I tried to compare out-of-sample estimates from three machine-learning algorithms—kernel-based regularized least squares (KRLS), Random Forests (RF), and support vector machines (SVM)—trained on and then applied to the same variables and data. In five-fold cross-validation, the three algorithms produced similar AUC scores, but histograms of the out-of-sample estimates showed much less variance for KRLS than RF and SVM. The mean out-of-sample “forecast” from all three was about 0.009, the base rate for the event, but the maximum for KRLS was only about 0.01, compared with maxes in the 0.4s and 0.7s for the others. It turned out that KRLS was doing about as well at rank ordering the cases as RF and SVM, but it was much more conservative in estimating the likelihood of an event. To consider that difference in my comparisons, I needed to apply scoring rules that were sensitive to forecast calibration and my particular concern with avoiding false negatives, and Merkle’s ‘scoring’ package gave me the functions I needed to do that. (More on the results some other time.)
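
To make that concrete, here is a toy comparison in R. The numbers are invented, not the KRLS, RF, or SVM output described above: two sets of forecasts that rank the cases identically, and therefore earn the same AUC, but that differ sharply in how much probability they put on the one case that has an event. A proper scoring rule like the Brier score tells them apart; the ‘scoring’ package wraps this and other rules, but the basic calculation is simple enough to do by hand:

y <- c(rep(0, 99), 1)                # 100 cases, one event: a 1-percent base rate
p_timid <- c(rep(0.005, 99), 0.010)  # hugs the base rate but still ranks the event case first
p_bold  <- c(rep(0.005, 99), 0.700)  # same ranking, much more probability on the event case

brier <- function(p, y) mean((p - y)^2)  # mean squared error of the probabilities; lower is better
brier(p_timid, y)  # about 0.0098
brier(p_bold, y)   # about 0.0009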

3. Last week, Andreas Beger wrote a great post for the WardLab blog, Predictive Heuristics, cogently explaining why event data is so important to improving forecasts of political crises:

To predict something that changes…you need predictors that change.

That sounds obvious, and in one sense it is. As Beger describes, though, most of the models political scientists have built so far have used slow-changing country-year data to try to anticipate not just where but also when crises like coup attempts or civil-war onsets will occur. Some of those models are very good at the “where” part, but, unsurprisingly, none of them does so hot on the “when” part. Beger explains why that’s true and how new data on political events can help us fix that.

4. Finally, Chris Blattman, Rob Blair, and Alexandra Hartman have posted a new working paper on predicting violence at the local level in “fragile” states. As they describe in their abstract,

We use forecasting models and new data from 242 Liberian communities to show that it is possible to predict outbreaks of local violence with high sensitivity and moderate accuracy, even with limited data. We train our models to predict communal and criminal violence in 2010 using risk factors measured in 2008. We compare predictions to actual violence in 2012 and find that up to 88% of all violence is correctly predicted. True positives come at the cost of many false positives, giving overall accuracy between 33% and 50%.

The patterns Blattman and Blair describe in that last sentence are related to what Beger was talking about with cross-national forecasting. Blattman, Blair, and Hartman’s models run on survey data and some other structural measures describing conditions in a sample of Liberian localities. Their predictive algorithms were derived from a single time step: inputs from 2008 and observations of violence from 2010. When those algorithms are applied to data from 2010 to predict violence in 2012, they do okay—not great, but “[similar] to some of the earliest prediction efforts at the cross-national level.” As the authors say, to do much better at this task, we’re going to need more and more dynamic data covering a wider range of cases.
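
For readers who want to see how high sensitivity and middling overall accuracy can coexist, here is an invented confusion matrix in R. The counts are illustrative only, chosen to echo the headline rates rather than taken from the paper:

tp <- 44; fn <- 6    # communities with violence: flagged vs. missed
fp <- 100; tn <- 50  # communities without violence: flagged anyway vs. correctly cleared

sensitivity <- tp / (tp + fn)                   # 0.88: almost all of the violence is caught...
accuracy    <- (tp + tn) / (tp + fn + fp + tn)  # 0.47: ...but many of the flags are false alarms
c(sensitivity = sensitivity, accuracy = accuracy)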

Whatever the results, I think it’s great that the authors are trying to forecast at all. Even better, they make explicit the connections they see between theory building, data collection, data exploration, and prediction. On that subject, the authors get the last word:

However important deductive hypothesis testing remains, there is much to gain from inductive, data-driven approaches as well. Conflict is a complex phenomenon with many potential risk factors, and it is rarely possible to adjudicate between them on ex ante theoretical grounds. As datasets on local violence proliferate, it may be more fruitful to (on occasion) let the data decide. Agnosticism may help focus attention on the dependent variable and illuminate substantively and statistically significant relationships that the analyst would not have otherwise detected. This does not mean running “kitchen sink” regressions, but rather seeking models that produce consistent, interpretable results in high dimensions and (at the same time) improve predictive power. Unexpected correlations, if robust, provide puzzles and stylized facts for future theories to explain, and thus generate important new avenues of research. Forecasting can be an important tool in inductive theory-building in an area as poorly understood as local violence.

Finally, testing the predictive power of exogenous, statistically significant causes of violence can tell us much about their substantive significance—a quantity too often ignored in the comparative politics and international relations literature. A causal model that cannot generate predictions with some reasonable degree of accuracy is not in fact a causal model at all.
