How Social Science Is Like Microbiology

I’m almost finished reading Michael Pollan’s latest, Cooked. It’s a terrific book about food, but it’s also steeped in science, and I wanted to share a passage from the part of the book on fermentation that really resonated with me. Pollan is writing about microbiology, but the developments he identifies in that field could (or should) apply just as well to the social sciences. The passage starts like this:

In the decades since Louis Pasteur founded microbiology, medical research has focused mainly on bacteria’s role in causing disease. The bacteria that reside in and on our bodies were generally regarded as either harmless “commensals”—freeloaders, basically—or pathogens to be defended against. Scientists tended to study these bugs one at a time, rather than as communities. This was partly a deeply ingrained habit of reductive science, and partly a function of the available tools. Scientists naturally focused their attention on the bacteria they see, which meant the handful of individual bugs that could be cultivated in a petri dish. There, they found some good guys and some bad guys. But the general stance toward bacteria we had discovered all around us was shaped by metaphors of war, and in that war, antibiotics became the weapon of choice.

The “habit of reductive science” Pollan describes should be familiar to social scientists, too. We often sort the objects of our analysis into binary categories of helpful and hurtful, assume the objects we see are all there really is, and then design interventions to try to kill the bad without harming the good. Where microbiology has traditionally drawn a sharp line between pathogens and cells that belong, social science has neatly distinguished rebel groups and criminal gangs and patronage networks from bureaucracies and political parties and civil society. Where medicine has antibiotics, development practitioners have aid.

What we’re now learning, though, is that these lines are really much blurrier than we’ve assumed. Pollan goes on:

But it turns out that the overwhelming majority of bacteria residing in the gut simply refuse to grow on a petri dish—a phenomenon known among researchers as “the great plate anomaly.” Without realizing it, they were practicing what is sometimes called parking-lot science—named for the human tendency to search for lost keys under the streetlights not because that’s where we lost them but because that is where we can best see. The petri dish was a streetlight. But when, in the early 2000s, researchers developed genetic “batch” sequencing techniques allowing them to catalog all the DNA in a sample of soil, say, or seawater or feces, science suddenly acquired a broad and powerful beam light that could illuminate the entire parking lot. When it did, we discovered hundreds of new species in the human gut doing all sorts of unexpected things.

This, to me, is the promise of what Gary King calls the “social science data revolution.” Exponential growth in the production and distribution of information to and from all parts of the world, and in our collective capacity to store and process and analyze that information, is to the social sciences what genetic batch sequencing is to microbiology. Our libraries and limited professional networks were our petri dishes, and now they’ve been shattered—in a good way.

Pollan then describes where microbiology’s version of the data revolution has led:

To their surprise, microbiologists discovered that nine of every ten cells in our bodies do not belong to us, but to these microbial species (most of them residents of our gut), and that 99 percent of the DNA we’re carrying around belongs to those microbes. Some scientists, trained in evolutionary biology, began looking at the human individual in a humbling new light: as a kind of superorganism, a community of several hundred coevolved and interdependent species. War metaphors no longer made sense. So the microbiologists began borrowing new metaphors from the ecologists.

I’d say a comparable gestalt shift is occurring in some corners of social science, with similarly dramatic implications. For decades, we’ve cranked out snapshots and diagrams and typologies of objects—states, parties, militaries, ethnic groups—that we’ve assumed to be more or less static and distinct and told just-so stories about how one thing changes into another. Now, we’re shedding those functionalist assumptions and getting better at seeing those objects as permeable superorganisms embedded in ecosystems, all of them continually coevolving in ways that may elude our capacity to narrate, or even to understand at all. The implications are simultaneously thrilling and overwhelming.

The State of Democracy in Turkey

Is Turkey still a democracy? Was it ever? How many jailed journalists and canisters of tear gas does it take to get to authoritarian rule?

The best statement I’ve seen so far on what Turkey’s ongoing crisis says about the state of its national political regime comes from Steven A. Cook and Michael Koplow. For Foreign Policy, they write:

Turkish politics is not necessarily more open than it was a decade ago, when the [ruling Justice and Development Party, or AKP] was pursuing democratic reforms in order to meet the European Union’s requirements for membership negotiations. It is just closed in an entirely different way. Turkey has essentially become a one-party state… Successful democracies provide their citizens with ways in which to express their desires and frustrations beyond periodic elections, and Turkey has failed spectacularly in this regard.

Cook and Koplow’s piece is titled “How democratic is Turkey? Not as democratic as Washington thinks it is.” What that title and the essay that follows implicitly acknowledge is that the questions I posed at the start of this post are sometimes impossible to resolve with confidence.


I know this challenge well because as part of my work for the Political Instability Task Force, I used to have to make binary calls like that every year for all countries of the world with populations larger than half a million. To make those calls, I would apply a checklist I had developed to an assemblage of newspaper articles and reports from election observers and human-rights groups and decide whether or not a country deserved to be called a democracy. My checklist was based on standard procedural definitions of democracy, and countries that failed to satisfy any one of the conditions established therein were labeled autocracies. Either you’re in the club or you’re out.

That process and the data it produced made sense for certain research tasks, but they also swept under the rug the ambiguity and uncertainty that make cases like Turkey right now so important for our understanding of what democracy is, and how it really emerges and recedes. Many regimes are easy to tag as democracies or autocracies, but there’s a sizable bloc that defies this bifurcation, and this bloc has only gotten larger in the past 25 years. As more and more states that long eschewed democratic procedures have adopted them, they have often done so in bits and pieces. What one hand has given in formal rules, the other has often taken away with informal practices and outright subterfuge that are meant to preserve the power distribution “real” democracy would threaten to overturn.

To understand what’s happening in these situations, I think Charles Tilly’s process-oriented approach to democracy is more useful. As Tilly says on page 24 of—what else?—Democracy, “Democratization and de-democratization occur continuously, with no guarantee of an end point in either direction.” To structure our thinking about what those processes entail, he asserts that

A regime is democratic to the degree that political relations between the state and its citizens feature broad, equal, protected and mutually binding consultation.

Elections are the most obvious form that consultation takes, but they aren’t the only form, and states can hold free elections while screwing up the protection and mutually binding parts.

So is Turkey a democracy? Who knows, but as Cook and Koplow argue, it’s almost certainly less democratic than it was a few years ago. As Erdogan and his supporters keep pointing out, Turkey under the AKP seems to be doing fine on the most obvious version of broad and equal consultation, namely, elections. Where it’s plainly slipped is on the “protected and mutually binding consultation” part. The disturbingly frequent arrests of journalists and alleged coup plotters, and now the state’s overreaction to nonviolent protests on matters of routine public policy, give the lie to the claim the Turkish state gives all citizens equal treatment and due process. Instead, we see a regime in which (paraphrasing Tilly) state agents increasingly use their power to punish their perceived enemies and reward their friends.

On this point, a couple of comments Prime Minister Erdogan made in a speech on Saturday speak volumes. Live-tweeting that speech, Turkish journalist Mahir Zeynalov spotlighted these choice remarks:

What those remarks reveal is a state that is happy to appeal to the citizens who reliably support it but closes off consultation with, and even bullies, the ones who don’t. The resulting regime may still be recognizable as a variation on the theme of democracy, but the discordant notes of authoritarianism are plainly audible and keep growing louder.

Lost in the Fog of Civil War in Syria

On Twitter a couple of days ago, Adam Elkus called out a recent post on Time magazine’s World blog as evidence of the way that many people’s expectations about the course of Syria’s civil war have zigged and zagged over the past couple of years. “Last year press was convinced Assad was going to fall,” Adam tweeted. “Now it’s that he’s going to win. Neither perspective useful.” To which the eminent civil-war scholar Stathis Kalyvas replied simply, “Agreed.”

There’s a lesson here for anyone trying to glean hints about the course of a civil war from press accounts of a war’s twists and turns. In this case, it’s a lesson I’m learning through negative feedback.

Since early 2012, I’ve been a participant/subject in the Good Judgment Project (GJP), a U.S. government-funded experiment in “wisdom of crowds” forecasting. Over the past year, GJP participants have been asked to estimate the probability of several events related to the conflict in Syria, including the likelihood that Bashar al-Assad would leave office and the likelihood that opposition forces would seize control of the city of Aleppo.

I wouldn’t describe myself as an expert on civil wars, but during my decade of work for the Political Instability Task Force, I spent a lot of time looking at data on the onset, duration, and end of civil wars around the world. From that work, I have a pretty good sense of the typical dynamics of these conflicts. Most of the civil wars that have occurred in the past half-century have lasted for many years. A very small fraction of those wars flared up and then ended within a year. The ones that didn’t end quickly—in other words, the vast majority of these conflicts—almost always dragged on for several more years at least, sometimes even for decades. (I don’t have my own version handy, but see Figure 1 in this paper by Paul Collier and Anke Hoeffler for a graphical representation of this pattern.)

On the whole, I’ve done well in the Good Judgment Project. In the year-long season that ended last month, I ranked fifth among the 303 forecasters in my experimental group, all while the project was producing fairly accurate forecasts on many topics. One thing that’s helped me do well is my adherence to what you might call the forecaster’s version of the Golden Rule: “Don’t neglect the base rate.” And, as I just noted, I’m also quite familiar with the base rates of civil-war duration.

So what did I do when asked by GJP to think about what would happen in Syria? I chucked all that background knowledge out the window and chased the very narrative that Elkus and Kalyvas rightly decry as misleading.

Here’s a chart showing how I assessed the probability that Assad wouldn’t last as president beyond the end of March 2013, starting in June 2012. The actual question asked us to divide the probability of his exiting office across several time periods, but for simplicity’s sake I’ve focused here on the part indicating that he would stick around past April 1. This isn’t the same thing as the probability that the war would end, of course, but it’s closely related, and I considered the two events as tightly linked. As you can see, until early 2013, I was pretty confident that Assad’s fall was imminent. In fact, I was so confident that at a couple of points in 2012, I gave him zero chance of hanging on past March of this year—something a trained forecaster really never should do.

[Chart: GJP Assad forecasts]
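To see why a forecast of zero is so costly, here is a minimal sketch of two common scoring rules for a yes/no question. It uses the simple squared-error form of the Brier score and a logarithmic score; the function names are hypothetical, and this is an illustration of the general idea, not necessarily the exact formula GJP applied.

```python
import math

def brier_score(p_event: float, occurred: bool) -> float:
    """Squared error between the forecast probability and the 0/1 outcome."""
    outcome = 1.0 if occurred else 0.0
    return (p_event - outcome) ** 2

def log_loss(p_event: float, occurred: bool) -> float:
    """Negative log of the probability assigned to what actually happened."""
    p_realized = p_event if occurred else 1.0 - p_event
    if p_realized <= 0.0:
        # A forecast of exactly zero for something that happens is unboundedly bad.
        return math.inf
    return -math.log(p_realized)

# Event: "Assad is still in office after the deadline" (it occurred).
for p in (0.0, 0.1, 0.5, 0.9):
    print(p, brier_score(p, True), log_loss(p, True))
```

Under the squared-error rule, a zero forecast for an event that happens earns the worst possible score; under the logarithmic rule, the penalty is infinite.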

Now here’s another chart showing my estimates of the likelihood that rebels would seize control of Aleppo before May 1, 2013. The numbers are a little different, but the basic pattern is the same. I started out very confident that the rebels would win the war soon and only swung hard in the opposite direction in early 2013, as the boundaries of the conflict seemed to harden.

[Chart: GJP Aleppo forecasts]

It’s impossible to say what the true probabilities were in this or any other uncertain situation. Maybe Assad and Aleppo really were on the brink of falling for a while and then the unlikely-but-still-possible version happened anyway.

That said, there’s no question that forecasts more tightly tied to the base rate would have scored a lot better in this case. Here’s a chart showing what my estimates might have looked like had I followed that rule, using approximations of the hazard rate from the chart in the Collier and Hoeffler paper. If anything, these numbers overstate the likelihood that a civil war will end at a given point in time.

[Chart: GJP base-rate forecasts]
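As a rough illustration of how a base-rate forecast like that could be built, the sketch below assumes a constant monthly hazard, meaning the same chance of the war ending in any given month. The 2 percent figure is a made-up placeholder, not a number taken from the Collier and Hoeffler paper, and the function name is hypothetical.

```python
def prob_ends_within(months: int, monthly_hazard: float) -> float:
    """Probability the event happens at least once within `months` months,
    assuming a constant and independent monthly hazard."""
    if months < 0:
        raise ValueError("months must be non-negative")
    if not 0.0 <= monthly_hazard <= 1.0:
        raise ValueError("monthly_hazard must be between 0 and 1")
    survival = (1.0 - monthly_hazard) ** months  # chance of no ending in every month
    return 1.0 - survival

HYPOTHETICAL_HAZARD = 0.02  # placeholder value, for illustration only

# Forecast for a fixed deadline as the remaining time shrinks.
for months_left in (9, 6, 3, 1):
    print(months_left, round(prob_ends_within(months_left, HYPOTHETICAL_HAZARD), 3))
```

With an assumption like this, the forecast stays low and drifts downward as the deadline approaches, rather than lurching with each week's headlines.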

I didn’t keep a log spelling out my reasoning at each step, but I’m pretty confident that my poor performance here is an example of motivated reasoning. I wanted Assad to fall and the pro-democracy protesters who dominated the early stages of the uprising to win, and that desire shaped what I read and then remembered when it came time to forecast. I suspect that many of the pieces I was reading were slanted by similar hopes, creating a sort of analytic cascade similar to the herd behavior thought to drive many financial-market booms and busts. I don’t have the data to prove it, but I’m pretty sure the ups and downs in my forecasts track the evolving narrative in the many newspaper and magazine stories I was reading about the Syrian conflict.

Of course, that kind of herding happens on a lot of topics, and I was usually good at avoiding it. For example, when tensions ratcheted up on the Korean Peninsula earlier this year, I hewed to the base rate and didn’t substantially change my assessment of the risk that real clashes would follow.

What got me in the case of Syria was, I think, a sense of guilt. The Assad government has responded to a legitimate popular challenge with mass atrocities that we routinely read about and sometimes even see. In parts of the country, the resulting conflict is producing scenes of absurd brutality. This isn’t a “problem from hell,” as Samantha Power’s book title would have it; it is a glimpse of hell. And yet, in the face of that horror, I have publicly advocated against American military intervention. Upon reflection, I wonder if my wildly optimistic forecasting about the imminence of Assad’s fall wasn’t my unconscious attempt to escape the discomfort of feeling complicit in the prolongation of that suffering.

As a forecaster, if I were doing these questions over, I would try to discipline myself to attend to the base rate, but I wouldn’t necessarily stop there. As I’ve pointed out in a previous post, the base rate is a valuable anchoring device, but attending to it doesn’t mean automatically ignoring everything else. My preferred approach, when I remember to have one, is to take that base rate as a starting point and then use Bayes’ theorem to update my forecasts in a more disciplined way. Still, I’ll bring a newly skeptical eye to the flurry of stories predicting that Assad’s forces will soon defeat Syria’s rebels and keep their patron in power. Now that we’re a couple years into the conflict, quantified history tells us that the most likely outcome in any modest slice of time (say, months rather than years) is, tragically, more of the same.

And, as a human, I’ll keep hoping the world will surprise us and take a different turn.
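The base-rate-plus-Bayes approach mentioned above can be sketched in a few lines. This is only an illustration: the prior and the two likelihoods are invented numbers, and the function name is hypothetical.

```python
def bayes_update(prior: float, p_evidence_if_true: float,
                 p_evidence_if_false: float) -> float:
    """Posterior probability of a hypothesis after one piece of evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    if denominator == 0.0:
        raise ValueError("evidence is impossible under both hypotheses")
    return numerator / denominator

# Hypothetical inputs: start from a base rate of 10 percent, then see a news
# report that is twice as likely if the hypothesis is true as if it is false.
prior = 0.10
posterior = bayes_update(prior, p_evidence_if_true=0.6, p_evidence_if_false=0.3)
print(round(prior, 2), "->", round(posterior, 2))
```

The point of the exercise is that even fairly suggestive evidence moves a low base rate only part of the way, so a single dramatic story should not swing a forecast from near zero to near certainty.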

“They Said It Was Going to Rain”

Most Saturdays and some Sundays, I hook up with a bike ride that winds out of DC’s Rock Creek Park into semi-rural Maryland and back again over the course of a few hours. I depend on this ride for hard training and a shot of competition, but I’m a wet-weather wimp and will usually stay home and use the trainer in my basement if it’s raining or probably going to rain. So, one of the first things I do when I get up most weekend mornings is check the hourly forecasts at weather.com and Weather Underground. If there’s much risk of rain, I’ll open the radar map again close to my 9:45 departure and run the animated forecast for the next few hours. If that animation shows yellow or orange blobs swarming my regular route when I’m going to be on it, I almost always stay in.

One recent Sunday, the forecast had me hemming and hawing for a bit before I decided to go. The hourly breakout at weather.com pegged the chance of rain at 70 percent for the first couple of hours I’d be out, but it wasn’t raining at 9:30 and the radar map didn’t look bad, either. Updating completed, out I went.

The weather often dominates conversations at the start and finish of the ride, and on that Sunday two themes rang through the chatter I overheard: we’d gotten really lucky, and weather forecasters are idiots. “They said it was going to rain,” the Greek chorus kept repeating.


But, of course, that’s not what “they” said. In point of fact, meteorologists had pegged the odds of rain at about 2:1. According to those forecasts, it was probably going to rain, but the chances that it would stay dry weren’t so bad, either. I wouldn’t bet my mortgage on a probability of 0.3, but I’m okay with occasionally risking a soggy ride on one.

As a weather-wimpy cyclist, I was happy to catch the lucky break that Sunday. As a guy who sometimes forecasts for a living, I was intrigued by the consistent way in which so many people had distorted that probability. In our heads, the quantified uncertainty we saw in the paper or on the web was transformed into a categorical prediction of rain. What the modeler would want to contextualize before assessing—“For all of the hours I said there was a 70-percent chance of rain, how often did rain actually happen?”—the intended audience was fine judging in isolation and declaring, “Wrong!”
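The modeler's question is a calibration check, and it can be sketched as follows. The forecast record here is invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict

def calibration_table(forecasts, outcomes):
    """Group forecasts by stated probability (rounded to one decimal) and
    return {probability: (number of forecasts, observed event frequency)}."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    buckets = defaultdict(list)
    for p, happened in zip(forecasts, outcomes):
        buckets[round(p, 1)].append(1 if happened else 0)
    return {p: (len(hits), sum(hits) / len(hits))
            for p, hits in sorted(buckets.items())}

# Made-up record: ten hours forecast at 70 percent (rain in seven of them)
# and five hours forecast at 20 percent (rain in one of them).
forecasts = [0.7] * 10 + [0.2] * 5
outcomes = [True] * 7 + [False] * 3 + [True] + [False] * 4
print(calibration_table(forecasts, outcomes))
```

A forecaster is well calibrated when the observed frequency in each group sits close to the stated probability, which is something no single dry Sunday can confirm or refute.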

That we’re not so great at processing probabilities won’t surprise anyone familiar with psychological research from the past few decades on that subject. Exactly what form that bias takes under what conditions, though, still seems to be something of a mystery. In a New York Times blog post about forecasts of the U.S. presidential election, statistician Andrew Gelman wrote:

What if the weatherman told you there was a 30 percent chance of rain—would you be shocked if it rained that day? No.

Apparently, Gelman hasn’t met the crew from my weekend ride. Gelman goes on to connect his assertion to work by Amos Tversky and Daniel Kahneman on prospect theory, which is based, in part, on the expectation that people systematically overestimate the risk of low-probability events and underestimate the risk of high-probability ones. That expectation, in turn, is based on empirical research that has been replicated elsewhere, as the following chart shows:

[Chart: probability weighting estimates]

What’s puzzling to me here is that my fellow riders seemed to be distorting things in the opposite direction. Instead of taking a probability of 0.7 and thinking of it as a toss-up as Gelman and that chart predict they would, they had converted it into a sure thing. That’s still bias, of course—just not the kind I would have expected.
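For reference, the kind of distortion prospect theory predicts can be sketched with the one-parameter weighting function from Tversky and Kahneman's 1992 paper. The default curvature of 0.61 is their estimate for gains and is an assumption here; the studies behind the chart above may rest on different values, and the function name is hypothetical.

```python
def tk_weight(p: float, gamma: float = 0.61) -> float:
    """Tversky-Kahneman (1992) probability weighting:
    w(p) = p^g / (p^g + (1 - p)^g)^(1/g)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    numerator = p ** gamma
    return numerator / (numerator + (1.0 - p) ** gamma) ** (1.0 / gamma)

# Compare stated probabilities with their assumed subjective weights.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(tk_weight(p), 2))
```

With this parameter, a stated 0.7 is weighted at a little over one half, which is the toss-up reading, not the sure thing my fellow riders heard.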

If there’s a moral to this story, it’s that we still have a lot of work left to do in understanding how we cogitate on uncertainty and what that implies about how we should produce and present probabilistic forecasts. In many domains, we’re getting better and better at the forecasting part, but even very accurate forecasts are only as useful as we make them or let them be. To get from the one to the other, we still need to learn a lot more about how we process and act on that information—not just individually, but also organizationally and socially.

Challenges in Measuring Violent Conflict, Syria Edition

As part of a larger (but, unfortunately, gated) story on how the terrific new Global Database of Events, Language, and Tone (GDELT) might help social scientists forecast violent conflicts, the New Scientist recently posted some graphics using GDELT to chart the ongoing civil war in Syria. Among those graphics was this time-series plot of violent events per day in Syria since the start of 2011:

[Chart: Syrian conflict, New Scientist]

Based on that chart, the author of the story (not the producers of GDELT, mind you) wrote:

As Western leaders ponder intervention, the resulting view suggests that the violence has subsided in recent months, from a peak in the third quarter of 2012.

That inference is almost certainly wrong, and why it’s wrong underscores one of the fundamental challenges in using event data—whether it’s collected and coded by software or humans or some combination thereof—to observe the dynamics of violent conflict.

I say that inference is almost certainly wrong because concurrent data on deaths and refugees suggest that violence in Syria has only intensified in the past year. One of the most reputable sources on deaths from the war is the Syria Tracker. A screenshot of their chart of monthly counts of documented killings is shown below. Like GDELT, their data also identify a sharp increase in violence in late 2012. Unlike GDELT, their data indicate that the intensity of the violence has remained very high since then, and that’s true even though the process of documenting killings inevitably lags behind the actual violence.

[Chart: Syria Tracker monthly death counts]

We see a similar pattern in data from the U.N. High Commissioner for Refugees (UNHCR) on people fleeing the fighting in Syria. If anything, the flow of refugees has only increased in 2013, suggesting that the violence in Syria is hardly abating.

[Chart: UNHCR Syria refugee plot]

The reason GDELT’s count of violent events has diverged from other measures of the intensity of the violence in Syria in recent months is probably something called “media fatigue.” Data sets of political events generally depend on news sources to spot events of interest, and it turns out that news coverage of large-scale political violence follows a predictable arc. As Deborah Gerner and Phil Schrodt describe in a paper from the late 1990s, press coverage of sustained and intense conflicts is often high when hostilities first break out but then declines steadily thereafter. That decline can happen because editors and readers get bored, burned out, or distracted. It can also happen because the conflict gets so intense that it becomes, in a sense, too dangerous to cover. In the case of Syria, I suspect all of these things are at work.

My point here isn’t to knock GDELT, which is still recording scores or hundreds of events in Syria every day, automatically, using open-source code, and then distributing those data to the public for free. Instead, I’m just trying to remind would-be users of any data set of political events to infer with caution. Event counts are one useful way to track variation over time in political processes we care about, but they’re only one part of the proverbial elephant, and they are inevitably constrained by the limitations of the sources from which they draw. To get a fuller sense of the beast, we need as often as possible to cross-reference those event data with other sources of information. Each of the sources I’ve cited here has its own blind spots and selection biases, but a comparison of trends from all three—and, importantly, an awareness of the likely sources of those biases—is enough to give me confidence that the civil war in Syria is only continuing to intensify. That says something important about Syria, of course, but it also says something important about the risks of drawing conclusions from event counts alone.

PS. For a great discussion of other sources of bias in the study of political violence, see Stathis Kalyvas’ 2004 essay on “The Urban Bias in Research on Civil Wars” (PDF).

Dart-Throwing Chimp Does TEDxTbilisi

Last month, I traveled to Georgia (the country) to give a talk at the second annual TEDxTbilisi. In that talk, I used stories about shoddy infrastructure to explore the gap between conventional theories and my own understanding of the things that cause authoritarian regimes to persist and then collapse. Called “Why Dictators Build Stuff that Crumbles,” my script was basically a mash-up of a couple of blog posts from the past year: one of nearly the same name, and another on why political activism over threats to public health and safety presents authoritarian regimes with special dilemmas.

The event was terrific—full house, great venue, good refreshments—and the small army of volunteers it took to make TEDxTbilisi happen did tremendous work. To readers of this blog, I’d especially recommend these four talks:

* Dato Gogigchaishvili, a Georgian television host and producer, gave a really smart and funny talk that probed the truth and limits of cross-cultural comparisons.

* Rusudan Gotsiridze spoke beautifully and humorously about gender roles through the lens of her own experiences as the first female bishop in Georgia.

* Educators and parents will appreciate the talk by Mark Rein-Hagen, a professional game designer, about learning through playing.

* The theme for TEDxTbilisi this year was “crossroads,” and Donald Rayfield capped the day with a great talk about Georgia’s long and difficult history as a place squished in between other, more powerful states and empires.

Honestly, preparing for the event was a lot harder than I’d expected. Having a blog where I regularly try to present social-science ideas to a broader audience made the initial tasks of identifying a relevant topic and drafting a script easier than they might have been. That part, I actually enjoyed. Much harder for me were committing the talk to memory and rehearsing it enough so that it (hopefully) didn’t look and sound too canned.

I’m sure the memory and delivery parts are easier for some people than others, and I suspect they get easier when you do them routinely. They were new to me, though, and I put a lot of hours into it over the two weeks before the event, reading out loud and then practicing versions of the talk. The closer I got to the trip, the more of my intellectual processing power it seemed to absorb. I was a lousy creative thinker that last week, and once in that home stretch I completely whiffed on a phone call I was supposed to make for work, something I never do. Having been through this once, I’m much more impressed with the people who make that kind of performance look natural and effortless than I used to be.

Finally, I gotta say, the process was exhausting. I am a creature of habit who rarely travels for work and almost never travels overseas. My TEDxTbilisi trip was a five-day blast with opening and closing legs of 24-hour travel to and from a city eight time zones ahead of home. During the three days I was in Tbilisi, the combination of jet lag, noise and cigarette smoke in the hotel, caffeine withdrawal, and anxiety about the impending event meant that I slept poorly. I used to race a lot as a runner and then a cyclist, and one of the big rules of thumb in those worlds is to stick to normal routines as much as possible before important races to keep the stress down and energy and focus up. Here, I’d basically done the opposite, shaking up everything I normally do. If I’d had my druthers, I’d have taken my first crack at this kind of thing under less stressful circumstances.

Of course, in real life you take what you can get, and in TEDxTbilisi I got a great opportunity. I hope you enjoy the talk.

I’m Down with Complexity and All, But…

In a recent Scientific American blog post called “Big Data Needs a Big Theory,” Geoffrey West calls for a unified theory of complex systems that will advance our understanding of, and capacity to predict, stasis and change in many domains. Quoting at length:

The digital revolution is driving much of the increasing complexity and pace of life we are now seeing, but this technology also presents an opportunity… With new computational tools and techniques to digest vast, interrelated databases, researchers and practitioners in science, technology, business and government have begun to bring large-scale simulations and models to bear on questions formerly out of reach of quantitative analysis, such as how cooperation emerges in society, what conditions promote innovation, and how conflicts spread and grow.

The trouble is, we don’t have a unified, conceptual framework for addressing questions of complexity. We don’t know what kind of data we need, nor how much, or what critical questions we should be asking. “Big data” without a “big theory” to go with it loses much of its potency and usefulness, potentially generating new unintended consequences.

When the industrial age focused society’s attention on energy in its many manifestations—steam, chemical, mechanical, and so on—the universal laws of thermodynamics came as a response. We now need to ask if our age can produce universal laws of complexity that integrate energy with information. What are the underlying principles that transcend the extraordinary diversity and historical contingency and interconnectivity of financial markets, populations, ecosystems, war and conflict, pandemics and cancer? An overarching predictive, mathematical framework for complex systems would, in principle, incorporate the dynamics and organization of any complex system in a quantitative, computable framework.

We will probably never make detailed predictions of complex systems, but coarse-grained descriptions that lead to quantitative predictions for essential features are within our grasp. We won’t predict when the next financial crash will occur, but we ought to be able to assign a probability of one occurring in the next few years. The field is in the midst of a broad synthesis of scientific disciplines, helping reverse the trend toward fragmentation and specialization, and is groping toward a more unified, holistic framework for tackling society’s big questions.

Not to put too fine a point on it, but I think that agenda is unrealistic.

I agree with West that human social systems are best understood as complex systems in the technical sense of that term (see here). Still, on the possibility of law-like regularities in complex systems that extend to large-scale human social behavior and are usefully predictive, I’m skeptical. It’s hard for me to imagine what those laws would look like, but then I know that my incapacity to understand the universe is not a reliable indicator of the universe’s inherent regularity or intelligibility.

At the same time, I think West’s analogizing to physics and the laws of thermodynamics ignores the single most-important difference between the “natural” sciences and the social sciences, namely, the (in)ability to perform true experiments. (N.B. Humans and their social interactions are, of course, entirely “natural,” too, but these are the terms we conventionally use.) Social scientists can only observe the systems we study; we can’t repeatedly perturb them in specific ways under tightly controlled conditions and see how things play out.

The impossibility of experimentation means we’re never going to be able to see the counterfactuals we’d need to see to make clear and confident inferences about rules or laws. That doesn’t mean we can’t find some robust patterns, but those patterns will never be anywhere near as universal and specific as the laws of thermodynamics.

The fuzziness of our understanding also means that the patterns we do see will have only modest predictive power at best. Those fuzzy patterns will allow us to assess differences in propensities with some success, as they already do now, but they will not lead us to sharply accurate predictions about the timing and details of change.

More important, those patterns themselves will change over time, as the underlying system continues to evolve. As West suggests, the changes that are creating new opportunities for analysis are themselves products of exponential growth in the complexity of human society. It’s an empirical question, I suppose, but I find it hard to believe that the processes which beget conflicts between states in the middle of the twenty-first century—an age of nukes and mega-cities and deep globalization—will resemble the processes that begat World Wars I and II in all but the most banal ways. And, of course, that’s assuming that states in the conventional sense are even still around.

Sovereignty Without Territoriality?

The concentration of manpower was the key to political power in premodern Southeast Asia… This overwhelming concern for obtaining and holding population at the core is shot through every aspect of precolonial statecraft. What Geertz says about Balinese political rivalries—that they were “a struggle more for men than for land”—could apply equally to all of mainland Southeast Asia. This principle animated the conduct of warfare, which was less a grab for distant territory than a quest for captives who could be resettled at the core… Early European officials were frequently astounded by the extremely vague demarcation of territories and provinces in their new colonies and puzzled by an administration of manpower that had little or nothing to do with territorial jurisdiction… As Thongchai Winichakul’s insightful book shows, the Siamese paid more attention to the manpower they could summon than to sovereignty over land that had no value in the absence of labor.

That’s from Chapter 3 (pp. 64-68) of James Scott’s The Art of Not Being Governed. To an inhabitant of the “modern” world who studies international politics, Scott’s description of powerful states that only vaguely demarcated and policed their putative territorial boundaries serves as an intriguing reminder that the fusion of territoriality and political sovereignty we now take for granted is not inevitable. Organizations can exercise, and have exercised, substantial authority over human society without husbanding exclusive control over specific patches of land. Scott sees similar processes at work in nineteenth- and twentieth-century sub-Saharan Africa:

The theme of manpower concentration permeates the literature on indigenous politics: “The drive to acquire relatives, adherents, dependents, retainers, and subjects and to keep them attached to oneself as a kind of social and political ‘capital’ has often been remarked upon as characteristic of African political processes.”… As in Southeast Asia there was little emphasis on sharp territorial boundaries, and the important rights were over people, not places, except for particular ritual sites. The competition for followers, kinsmen, and bondsmen operated at every level.

In fact, I’d say there are at least three interconnected but distinct spaces in which political authority can be organized—physical (territory), social (people), and economic (trade)—and the three don’t necessarily have to hang together. Scott has already described for us states whose sovereignty was rooted primarily in the social and economic realms with less attention to territory.

Contemporary drug cartels arguably exemplify the possibility of organizations that compete for power in trade space without asserting sovereignty over territory or society in the way that modern states do. Large cartels sometimes attempt to establish territorial zones of impunity or even governance, but those efforts often come in response to rivals’ attempts to quash their power in trade space. More important, the point of that territorial control is usually to gain freedom from interference in their economic activities, not to assert the full panoply of political authority we attach to the modern idea of sovereignty. As John Sullivan says of contemporary “criminal insurgencies” in Mexico and elsewhere,

Organized crime groups (gangs and cartels)…usually seek to elude detection and prefer co-opting (corrupting) the instruments of state rather than engaging in direct confrontation… Yet as the current crime wars illustrate, these actors can directly confront the state when their interests are challenged (Bailey & Taylor, 2009). Criminal insurgency is the mechanism of the confrontation with the state that results when relationships between organized crime and the state fall into disequilibrium.

Criminal insurgency presents a challenge to states and communities. Criminal insurgency is different from conventional terrorism and insurgency because the criminal insurgents’ sole political motive is to gain autonomy and economic control over territory. They do so by hollowing out the state and creating criminal enclaves to secure freedom to maneuver.

It’s harder for me to think of an organization that competes for sovereignty in the social realm without seeking control over territory or trade. I suppose organized religion comes closest. Although some hierarchical religious organizations historically have also pursued control over land and trade, in ideological terms, their main claim attaches to the souls of their adherents and nothing else. Ethnicity might fit the bill, too, insofar as leaders of these communities of putative kinship claim authority over members wherever they may be and whatever trade they might take up. It’s also interesting to think about whether or not cyberspace is emerging as a fourth realm for political organization, intertwined with but at least partially independent of the other three, but that’s a question for another day.

What’s confusing to modern ears, I think, is the application of the word “state” to these other things. Scott explicitly did so, and I’m implicitly doing so here. My point in doing so is to highlight that the constructs we call “states” are just one of many organizations constantly competing for power in these various spaces. What’s unique about the modern state is its explicit claim to dominion over all three of those spaces—physical, social, and economic—within a particular set of sharply demarcated borders.

So, let’s flip it around: instead of calling all of these organizations states, let’s reserve that term for the modern thing, but let’s allow Scott’s passage to remind us that states are neither as inevitable nor as successful in their efforts to establish that dominion as we often assume. Instead, they are just one organizational form competing for sovereignty in these various realms, and their success in those struggles is neither as complete nor as final as they would like it to be. The fusion of sovereignty in the modern state is a specific idea, not a natural fact, and a self-serving one at that.

Road-Testing GDELT as a Resource for Monitoring Atrocities

As I said here a few weeks ago, I think the Global Dataset on Events, Location, and Tone (GDELT) is a fantastic new resource that really embodies some of the ways in which technological changes are coming together to open lots of new doors for social-scientific research. GDELT’s promise is obvious: more than 200 million political events from around the world over the past 30 years, all spotted and coded by well-trained software instead of the traditional armies of undergrad RAs, and with daily updates coming online soon. Or, as Adam Elkus’ t-shirt would have it, “200 million observations. Only one boss.”

BUT! Caveat emptor! Like every other data-collection effort ever, GDELT is not alchemy, and it’s important that people planning to use the data, or even just to consume analysis based on it, understand what its limitations are.

I’m starting to get a better feel for those limitations from my own efforts to use GDELT to help observe atrocities around the world, as part of a consulting project I’m doing for the U.S. Holocaust Memorial Museum’s Center for the Prevention of Genocide. The core task of that project is to develop plans for a public early-warning system that would allow us to assess the risk of onsets of atrocities in countries worldwide more accurately and earlier than current practice.

When I heard about GDELT last fall, though, it occurred to me that we could use it (and similar data sets in the pipeline) to support efforts to monitor atrocities as well. The CAMEO coding scheme on which GDELT is based includes a number of event types that correspond to various forms of violent attack and other variables indicating who was attacking whom. If we could develop a filter that reliably pulled events of interest to us from the larger stream of records, we could produce something like a near-real-time bulletin on recent violence against civilians around the world. Our record would surely have some blind spots (GDELT only tracks a limited number of news sources, and some atrocities just don’t get reported, period), but I thought it would reliably and efficiently alert us to new episodes of violence against civilians and help us identify trends in ongoing ones.

Well, you know what they say about plans and enemies and first contact. After digging into GDELT, I still think we can accomplish those goals, but it’s going to take more human effort than I originally expected. Put bluntly, GDELT is noisier than I had anticipated, and for the time being the only way I can see to sharpen that signal is to keep a human in the loop.

Imagine (fantasize?) for a moment that there’s a perfect record somewhere of all the political interactions GDELT is trying to identify. For kicks, let’s call it the Encyclopedia Eventum (EE). Like any detection system, GDELT can mess up in two basic ways: 1) errors of omission, in which GDELT fails to spot something that’s in the EE; and 2) errors of commission, in which it mistakenly records an event that isn’t in the EE (or, relatedly, is in the EE but in a different place). We might also call these false negatives and false positives, respectively.

At this point, I can’t say anything about how often GDELT is making errors of omission, because I don’t have that Encyclopedia Eventum handy. A more realistic strategy for assessing the rate of errors of omission would involve comparing a subset of GDELT to another event data set that’s known to be a fairly reliable measure for some time and place of something GDELT is meant to track (say, protest and coercion in Europe) and seeing how well they match up, but that’s not a trivial task, and I haven’t tried it yet.
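To make the two error types concrete, here is a minimal Python sketch of the comparison described above, assuming events from both sources can be reduced to the same hashable key. The `Event` fields, the function name, and the toy records are all hypothetical, and exact matching on keys is a simplification; lining up two real event data sets would need much fuzzier matching on dates, places, and actors.

```python
from typing import NamedTuple


class Event(NamedTuple):
    # Hypothetical key; real data sets would need fuzzier matching than this.
    day: str
    event_code: str
    source: str
    target: str


def compare_to_reference(detected, reference):
    """Count errors of omission and commission against a trusted reference set."""
    detected, reference = set(detected), set(reference)
    matched = detected & reference
    omissions = reference - detected    # false negatives: in the reference, missed by the detector
    commissions = detected - reference  # false positives: recorded, but not in the reference
    return {
        "matched": len(matched),
        "omissions": len(omissions),
        "commissions": len(commissions),
        # Share of reference events that were detected (None if the reference is empty).
        "recall": len(matched) / len(reference) if reference else None,
        # Share of detected events that are in the reference (None if nothing was detected).
        "precision": len(matched) / len(detected) if detected else None,
    }


# Toy records for illustration only; they are not drawn from any real data set.
reference = [
    Event("20110610", "202", "SYRGOV", "SYRCVL"),
    Event("20110611", "190", "SYRMIL", "SYRCVL"),
]
detected = [
    Event("20110610", "202", "SYRGOV", "SYRCVL"),
    Event("20110109", "202", "USAMIL", "VNMCVL"),
]
summary = compare_to_reference(detected, reference)
print(summary)
```

In this toy case one record matches, one reference event is missed (an error of omission), and one detected record has no counterpart in the reference (an error of commission).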

Instead, the noise I’m seeing is on the other side of that coin: the errors of commission, or false positives. Here’s what I mean:

To start developing my atrocities-monitoring filter, I downloaded the reduced and compressed version of GDELT recently posted on the Penn State Event Data Project page and pulled the tab-delimited text files for a couple of recent years. I’ve worked with event data before, so I’m familiar with basic issues in their analysis, but every data set has its own idiosyncrasies. After trading emails with a few CAMEO pros and reading Jay Yonamine’s excellent primer on event aggregation strategies, I started tinkering with a function in R that would extract the subset of events that appeared to involve lethal force against civilians. That function would involve rules to select on three features: event type, source (the doer), and target.

  • Event Type. For observing atrocities, type 20 (“Engage in Unconventional Mass Violence”) was an obvious choice. Based on advice from those CAMEO pros, I also focused on 18 (“Assault”) and 19 (“Fight”) but was expecting that I would need to be more restrictive about the subtypes, sources, and targets in those categories to avoid errors of commission.
  • Source. I’m trying to track violence by state and non-state agents, so I focused on GOV (government), MIL (military), COP (police), and SPY (intelligence agencies) for the former and REB (militarized opposition groups) and SEP (separatist groups) for the latter. The big question mark was how to handle records with just a country code (e.g., “SYR” for Syria) and no indication of the source’s type. My CAMEO consultants told me these would usually refer in some way to the state, so I should at least consider including them.
  • Target. To identify violence against civilians, I figured I would get the most mileage out of the OPP (non-violent political opposition), CVL (“civilians,” people in general), and REF (refugees) codes, but I wanted to see if the codes for more specific non-state actors (e.g., LAB for labor, EDU for schools or students, HLH for health care) would also help flag some events of interest.
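The three selection rules above can be sketched in a few lines. What follows is an illustrative Python sketch, not the R function discussed in this post: the dictionary keys (`event_code`, `source`, `target`) are hypothetical field names, and it assumes actor codes are built from three-letter chunks (so “SYRGOV” is SYR plus GOV). Bare country codes are only counted as state actors when they are passed in explicitly, which keeps that judgment call visible.

```python
# Role codes named in the list above; the chunk-matching rule is an assumption.
STATE_ROLES = frozenset({"GOV", "MIL", "COP", "SPY"})
CIVILIAN_ROLES = frozenset({"OPP", "CVL", "REF"})
VIOLENT_ROOTS = frozenset({"18", "19", "20"})  # Assault, Fight, Unconventional Mass Violence


def code_chunks(actor_code):
    """Split an actor code such as 'SYRGOV' into its 3-letter chunks."""
    return {actor_code[i:i + 3] for i in range(0, len(actor_code), 3)}


def is_state_violence_against_civilians(event, bare_country_codes=frozenset()):
    """Apply the event-type, source, and target rules to one record."""
    if event["event_code"][:2] not in VIOLENT_ROOTS:
        return False
    source, target = event["source"], event["target"]
    source_is_state = bool(code_chunks(source) & STATE_ROLES) or source in bare_country_codes
    target_is_civilian = bool(code_chunks(target) & CIVILIAN_ROLES)
    return source_is_state and target_is_civilian


def filter_events(events, bare_country_codes=frozenset()):
    """Return the records that pass all three rules."""
    return [e for e in events if is_state_violence_against_civilians(e, bare_country_codes)]


# Toy records for illustration only.
sample = [
    {"event_code": "193", "source": "USAGOV", "target": "USAMED"},
    {"event_code": "202", "source": "GOV", "target": "SYRCVL"},
    {"event_code": "202", "source": "SYR", "target": "SYRCVL"},
    {"event_code": "043", "source": "SYRGOV", "target": "SYRCVL"},
]
strict = filter_events(sample)
with_bare_syr = filter_events(sample, bare_country_codes=frozenset({"SYR"}))
print(len(strict), len(with_bare_syr))
```

Passing `bare_country_codes` as an argument rather than hard-coding it makes it easy to see how many extra records the bare-country decision lets through.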

After tinkering with the data a bit, I decided to write two separate functions, one for events with state perpetrators and another for events with non-state perpetrators. If you’re into that sort of thing, you can see the state-perpetrator version of that filtering function on Github, here.

When I ran the more than 9 million records in the “2011.reduced.txt” file through that function, I got back 2,958 events. So far, so good. As soon as I started poking around in the results, though, I saw a lot of records that looked suspicious. The current release of GDELT doesn’t include text from or links to the source material, so it’s hard to say for sure what real-world event any one record describes. Still, some of the perpetrator-and-target combos looked odd to me, and web searches for relevant stories either came up empty or reinforced my suspicions that the records were probably errors of commission. Here are a few examples, showing the date, event type, source, and target:

  • 1/8/2011 193 USAGOV USAMED. Type 193 is “Fight with small arms and light weapons,” but I don’t think anyone from the U.S. government actually got in a shootout or knife fight with American journalists that day. In fact, that event-source-target combination popped up a lot in my subset.
  • 1/9/2011 202 USAMIL VNMCVL. Taken on its face, this record says that U.S. military forces killed Vietnamese civilians on January 9, 2011. My hunch is that the story on which this record is based was actually talking about something from the Vietnam War.
  • 4/11/2011 202 RUSSPY POLCVL. This record seems to suggest that Russian intelligence agents “engaged in mass killings” of Polish civilians in central Siberia two years ago. I suspect the story behind this record was actually talking about the Katyn Massacre and associated mass deportations that occurred in April 1940.

That’s not to say that all the records looked wacky. Interleaved with these suspicious cases were records representing exactly the kinds of events I was trying to find. For example, my filter also turned up a 202 GOV SYRCVL for June 10, 2011, a day on which one headline blared “Dozens Killed During Syrian Protests.”

Still, it’s immediately clear to me that GDELT’s parsing process is not quite at the stage where we can peruse the codebook like a menu, identify the morsels we’d like to consume, phone our order in, and expect to have exactly the meal we imagined waiting for us when we go to pick it up. There’s lots of valuable information in there, but there’s plenty of chaff, too, and for the time being it’s on us as researchers to take time to try to sort the two out. This sorting will get easier to do if and when the posted version adds information about the source article and relevant text, but “easier” in this case will still require human beings to review the results and do the cross-referencing.

Over time, researchers who work on specific topics (like atrocities, or interstate war, or protest activity in specific countries) will probably be able to develop supplemental coding rules and tweak their filters to automate some of what they learn. I’m also optimistic that the public release of GDELT will accelerate improvements to the software and dictionaries it uses, expanding its reach while shrinking the error rates. In the meantime, researchers are advised to stick to the same practices they’ve always used (or should have, anyway): take time to get to know your data; parse it carefully; and, when there’s no single parsing that’s obviously superior, check the sensitivity of your results to different permutations.
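One simple way to run that kind of sensitivity check is to count how many records survive under each combination of plausible rules. Here is a minimal Python sketch; the option labels, the tuple layout of the records, and the toy data are all hypothetical, and the three-letter chunking of actor codes is an assumption rather than a documented rule.

```python
from itertools import product


def code_chunks(actor_code):
    """Split an actor code such as 'SYRGOV' into its 3-letter chunks."""
    return {actor_code[i:i + 3] for i in range(0, len(actor_code), 3)}


def count_matches(events, root_codes, source_roles, target_roles):
    """Count (event_code, source, target) records that pass one set of rules."""
    return sum(
        1
        for code, source, target in events
        if code[:2] in root_codes
        and code_chunks(source) & set(source_roles)
        and code_chunks(target) & set(target_roles)
    )


def sensitivity_table(events, root_options, target_options, source_roles):
    """Return counts for every combination of event-type and target rules."""
    return {
        (root_label, target_label): count_matches(events, roots, source_roles, targets)
        for (root_label, roots), (target_label, targets)
        in product(root_options.items(), target_options.items())
    }


# Toy records for illustration only.
events = [
    ("202", "SYRGOV", "SYRCVL"),
    ("193", "USAGOV", "USAMED"),
    ("190", "SYRMIL", "SYROPP"),
    ("180", "EGYCOP", "EGYREF"),
    ("043", "USAGOV", "SYRCVL"),
]
table = sensitivity_table(
    events,
    root_options={"20 only": {"20"}, "18-20": {"18", "19", "20"}},
    target_options={"CVL only": {"CVL"}, "OPP/CVL/REF": {"OPP", "CVL", "REF"}},
    source_roles=("GOV", "MIL", "COP", "SPY"),
)
for rules, count in table.items():
    print(rules, count)
```

If the counts swing widely from one combination to the next, any trend built on a single version of the filter deserves extra scrutiny.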

PS. If you have any suggestions on how to improve the code I’m using to spot potential atrocities or otherwise improve the monitoring process I’ve described, please let me know. That’s an ongoing project, and even marginal improvements in the fidelity of the filter would be a big help.

PPS. For more on these issues and the wider future of automated event coding, see this ensuing post from Phil Schrodt on his blog.

Hello?!? Not All Forecasters Are Strict Positivists

International relations is the most predictively oriented subfield of political science…Yet even in the other empirical subfields, the positivist notion that everything must ultimately be reducible to (knowable) universal laws displays its hold in excrescences such as quadrennial attempts to derive formulae for predicting the next presidential election outcome, usually on the basis of ‘‘real’’ (economic) factors. Even if one follows Milton Friedman (1953) in insisting that the factors expressed by such formulae are not supposed to be actually causing electoral outcomes, but are merely variables that (for some unknown reason) allow us to make good behavioral predictions, in practice one usually wants to know what is actually causing the behavior, and it is all too easy to assume that whatever is causing it—since it seems to be responsible for a behavioral regularity—must be some universal human disposition.

That’s from a 2012 paper by Jeffrey Friedman on Robert Jervis’ 1997 System Effects and the “problem of prediction.” I actually enjoyed the paper on the whole, but this passage encapsulates what drives me nuts about what many people—including many social “scientists”—think it means to try to make forecasts about politics.

Contrary to the assertions of some haters, political scientists almost never make explicit forecasts about the things they study—at least not in print or out loud. Some of that reticence presumably results from the fact that there’s no clear professional benefit to making predictions, and there is some professional risk in doing so and then being wrong.

Some of that reticence, though, also seems to flow from this silly but apparently widely held idea that the very act of forecasting implies that the forecaster accepts the strict positivist premise that “everything must ultimately be reducible to (knowable) universal laws.” To that, I say…

[Image: Charlie Brown, “AAUGH!”]

Probability is a mathematical representation of uncertainty, and a probabilistic forecast explicitly acknowledges that we don’t know for sure what’s going to happen. Instead, it’s an educated guess—or, in Bayesian terms, an informed belief.

Forecasters generally use evidence from the past to educate those guesses, but that act of empiricism in itself does not imply that we presume there are universal laws driving political processes lurking beneath that history. Instead, it’s really just a practical solution to the problem of wanting better information—sometimes to help us plan for the future, and sometimes to try to adjudicate between different ideas about the forces shaping those processes now and in the past.

Empiricism is a practical solution because it works—not perfectly, of course, but, for many problems of interest, a lot better than casting bones or reading entrails or consulting oracles. The handful of forecasters I know all embrace the premises that their efforts are only approximations, and that the world can always change in ways that will render the models we find helpful today less helpful in the future. In the meantime, though, we figure we can nibble away at our ignorance by making structured guesses about that future and seeing which ones turn out to be more reliable than the others. Physicists still aren’t entirely sure how planes manage to fly, but millions of us make a prediction every day that the plane we’re about to board is somehow going to manage that feat. We don’t need to be certain of the underlying law to find that prediction useful.
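Checking which structured guesses turn out to be more reliable requires some way to score probabilistic forecasts against what actually happened. The Brier score (the mean squared difference between the forecast probability and the 0/1 outcome) is one common choice, and it is used in the Python sketch below purely as an illustration; nothing here says it is the rule any particular forecaster uses, and the numbers are made up.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect score, and always guessing 0.5 scores 0.25.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("need at least one forecast to score")
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)


# Made-up example: four events, two of which occurred.
outcomes = [1, 0, 0, 1]
informed = [0.8, 0.2, 0.1, 0.6]    # a forecaster who leans the right way
coin_flip = [0.5, 0.5, 0.5, 0.5]   # a forecaster who always says 50/50

informed_score = brier_score(informed, outcomes)
coin_flip_score = brier_score(coin_flip, outcomes)
print(round(informed_score, 4), round(coin_flip_score, 4))
```

Neither forecaster claims to know a law governing these events; the score just tells us whose educated guesses were closer to what happened.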

Finally, I can’t resist: there’s real irony in Friedman’s choice of examples of misguided forecasting projects. To have called efforts to predict the outcome of U.S. presidential elections “excrescences” in the year those excrescences had a kind of popular coming out, well, that’s just unfortunate. I guess Friedman didn’t see that one coming.
