Be Vewy, Vewy Quiet

This blog has gone relatively quiet of late, and it will probably stay that way for a while. That’s partly a function of my personal life, but it also reflects a conscious decision to spend more time improving my abilities as a programmer.

I want to get better at scraping, making, munging, summarizing, visualizing, and analyzing data. So, instead of contemplating world affairs, I’ve been starting to learn Python; using questions on Stack Overflow as practice problems for R; writing scripts that force me to expand my programming skills; and building Shiny apps that put those skills to work. Here’s a screenshot of one app I’ve made—yes, it actually works—that interactively visualizes ACLED’s latest data on violence against civilians in Africa, based partly on this script for scraping ACLED’s website:

[Screenshot: Shiny app visualizing ACLED data on violence against civilians in Africa, July 2015]

When I started on this kick, I didn’t plan to stop writing blog posts about international affairs. As I’ve gotten into it, though, I’ve found that my curiosity about current events has ebbed, and the pilot light for my writing brain has gone out. Normally, writing ideas flare up throughout the day, but especially in the early morning. Lately, I wake up thinking about the coding problems I’m stuck on.

I think it’s a matter of attention, not interest. Programming depends on the tiniest details. All those details quickly clog the brain’s RAM, leaving no room for the unconscious associations that form the kernels of new prose. That clogging happens even faster when other parts of your life are busy, stressful, or off kilter, as they are for many of us, and as they are for me right now.

That’s what I think, anyway. Whatever the cause, though, I know that I’m rarely feeling the impulse to write, and I know that shift has sharply slowed the pace of publishing here. I’m leaving the channel open and hope I can find the mental and temporal space to keep using it, but who knows what tomorrow may bring?

ACLED in R

The Armed Conflict Location & Event Data Project, a.k.a. ACLED, produces up-to-date event data on certain kinds of political conflict in Africa and, as of 2015, parts of Asia. In this post, I’m not going to dwell on the project’s sources and methods, which you can read about on ACLED’s About page, in the 2010 journal article that introduced the project, or in the project’s user’s guides. Nor am I going to dwell on the necessity of using all political event data sets, including ACLED, with care—understanding the sources of bias in how they observe events and error in how they code them and interpreting (or, in extreme cases, ignoring) the resulting statistics accordingly.

Instead, my only aim here is to share an R script I’ve written that largely automates the process of downloading and merging ACLED’s historical and current Africa data and then creates a new data frame with counts of events by type at the country-month level. If you use ACLED in R, this script might save you some time and some space on your hard drive.

You can find the R script on GitHub, here.

The chief problem with this script is that the URLs and file names of ACLED’s historical and current data sets change with every update, so the code will need to be modified each time that happens. If the names were modular and the changes to them predictable, it would be easy to rewrite the code to keep up with those changes automatically. Unfortunately, they aren’t, so the best I can do for now is to give step-by-step instructions, in comments embedded in the script, on how to update the four relevant fields by hand. As long as the basic structure of the .csv files posted by ACLED doesn’t change, though, the rest should keep working.

[UPDATE: I revised the script so it will scrape the link addresses from the ACLED website and parse the file names from them. The new version worked after ACLED updated its real-time file earlier today, when the old version would have broken. Unless ACLED changes its file-naming conventions or the structure of its website, this version should work for the rest of 2015. In case it does fail, instructions on how to hard-code a workaround are included as comments at the bottom of the script.]
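For readers who want to see the shape of that approach, here is a minimal sketch of scraping the .csv link addresses with rvest, assuming the ACLED data page lists them as ordinary links; the page URL and the “realtime” keyword below are illustrative guesses, not the exact values used in the script on GitHub.

```r
# Minimal sketch: scrape CSV link addresses from ACLED's data page.
# The page URL and the "realtime" keyword are assumptions for illustration;
# see the script on GitHub for the version that tracks ACLED's actual site.
library(rvest)

acled_page <- read_html("http://www.acleddata.com/data/")  # hypothetical URL

hrefs <- html_attr(html_nodes(acled_page, "a"), "href")
csv_links <- grep("\\.csv$", hrefs, value = TRUE)

# Guess which link points to the current real-time file from its name
realtime_url <- grep("realtime", csv_links, ignore.case = TRUE, value = TRUE)[1]

ACLED.current <- read.csv(realtime_url, stringsAsFactors = FALSE)
```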

It should also be easy to adapt the part of the script that generates country-month event counts to slice the data even more finely, or to count by something other than event type. To do that, you would just need to add variables to the group_by() part of the block of code that produces the object ACLED.cm. For example, if you wanted to get counts of events by type at the level of the state or province, you would revise that line to read group_by(gwno, admin1, year, month, event_type). Or, if you wanted country-month counts of events by the type(s) of actor involved, you could use group_by(gwno, year, month, interaction) and then see this user’s guide to decipher those codes. You get the drift.
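For concreteness, here is roughly what that country-month counting block looks like in dplyr; the grouping variables and the object name ACLED.cm follow the text above, but the merged data frame name ACLED is a stand-in, so treat this as a sketch rather than a verbatim excerpt from the script.

```r
library(dplyr)

# Count events by type at the country-month level; add admin1 or swap in
# interaction to slice the data differently, as described above
ACLED.cm <- ACLED %>%
  group_by(gwno, year, month, event_type) %>%
  summarise(events = n()) %>%
  ungroup()
```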

The script also shows a couple of examples of how to use ggplot2 to generate time-series plots of those monthly counts. Here’s one I made of monthly counts of battle events by country for the entire period covered by ACLED as of this writing: January 1997–June 2015. A production-ready version of this plot would require some more tinkering with the size of the country names and the labeling of the x-axis, but this kind of small-multiples chart offers a nice way to explore the data before analysis.
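A small-multiples chart along those lines can be built with ggplot2 roughly as follows; this sketch reuses the ACLED.cm object from the example above, and the filter on event type is an assumption about how battles are labeled in the data.

```r
library(dplyr)
library(ggplot2)

# Monthly counts of battle events by country, one panel per country.
# The "Battle" filter is illustrative; check the event_type labels in the data.
battles <- ACLED.cm %>%
  filter(grepl("Battle", event_type)) %>%
  mutate(date = as.Date(paste(year, month, 1, sep = "-"))) %>%
  group_by(gwno, date) %>%
  summarise(events = sum(events)) %>%
  ungroup()

ggplot(battles, aes(x = date, y = events)) +
  geom_line() +
  facet_wrap(~ gwno) +
  labs(x = "Month", y = "Battle events")
```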

Monthly counts of battle events, January 1997–June 2015

If you use the script and find flaws in it or have ideas on how to make it work better or do more, please email me at ulfelder <at> gmail <dot> com.

The Dilemma of Getting to Civilian Control

A country can’t really qualify as a democracy without civilian control of its own security forces, but the logic that makes that statement true also makes civilian control hard to achieve, as events in Burkina Faso are currently reminding us.

The essential principle of democracy is popular sovereignty. Civilian control is fundamental to democracy because popular sovereignty requires that elected officials rule, but leaders of security forces—military and police—are not elected. Neither are the leaders of many other influential organizations, of course, but security forces occupy a special status in national affairs by virtue of their particular skills. To perform their designated roles, national rulers must determine and try to implement policies involving the collection of revenue and the production of public goods, including security. To do that, rulers need to wield the threat of coercion, and security forces supply that threat.

That necessity creates a dependency, and that dependency conveys power. In principle—and, historically, often in practice—leaders of security forces can use that dependency as leverage to bargain for bigger budgets or other policies they prefer for parochial reasons. Because those leaders are not held accountable to the public through elections, strong versions of that bargaining contravene the principle of popular sovereignty. Of course, security forces’ specific skills also make them uniquely capable of claiming political authority for themselves at any time. Military leaders rarely flex that muscle, but the threat of a coup only enhances their bargaining power with elected rulers, and thus further constrains popular sovereignty.

This logic implies that democracy only really obtains when state security forces reliably subordinate themselves to the authority of those elected civilian rulers. That arrangement seems clear in principle, but it turns out to be hard to achieve in practice. The problem is that the transition to civilian control demands that security forces concede their power. Organizations of all kinds are rarely excited about doing that, but it is especially hard for rulers to compel security forces to do it, because those forces are the stick those rulers would normally wield in that act of compellence. When pushed, military and police leaders can ask “Or what?” and civilian rulers will rarely have a strong answer. Under those circumstances, attempts to force the issue may have the opposite of their intended effect, provoking security forces into seizing power for themselves as a way to dispatch the civilian threat to their established position.

In short, the problem of getting to civilian control confronts civilian rulers with a dilemma: assert their authority and risk provoking a hard coup, or tolerate security forces’ continuing political power and accept what amounts to a perpetual soft coup.

This dilemma is bedeviling politics in Burkina Faso right now. Last fall, mass demonstrations in Burkina Faso triggered a military coup that toppled longtime autocratic ruler Blaise Compaoré. Under domestic and international pressure, the ensuing junta established a transitional process that is supposed to lead to democratic civilian rule after general elections on October 11, 2015.

Gen. Honore Nabere Traore leads an October 2014 press conference announcing that he would serve as president following Blaise Compaore’s apparent resignation. Traore was promptly supplanted by Lt. Col. Isaac Zida, who in November 2014 stepped aside for a civilian interim president, who then appointed Zida to the post of interim prime minister. (Photo: Theo Renault/Associated Press)

On paper, a civilian now rules Burkina Faso as interim president, but attempts to clarify the extent of the interim government’s power, and to circumscribe the role of certain security organs in Burkinabe politics, are generating the expected friction and heat. Here is how Alex Thurston described the situation on his Sahel Blog:

In recent weeks, NGOs and media outlets have buzzed with discussions of tension between the Presidential Security Regiment (RSP) and Prime Minister Yacouba Isaac Zida, a conflict that could, at worst, derail the transition. Although both Zida and Compaore belonged to the RSP in the past, the elite unit has reasons to fear that it will be disbanded and punished: in December, Zida called for its dismantling, and in February, a political crisis unfolded when Zida attempted to reshuffle the RSP’s officer corps (French).

The most recent crisis (French) involves suspicions in some quarters of the government that the RSP was planning to arrest Zida upon his return from a trip to Taiwan – suspicions that were serious enough to make Zida land at a military base instead of at the airport as planned (French). On June 29, the day after Zida got home, gendarmes in the capital questioned three RSP officers, including Lieutenant Colonel Céleste Coulibaly, about their involvement in the suspected plot. That evening, shots were heard coming from the RSP’s barracks, which sits behind the presidential palace. Rumors then spread that Zida was resigning under RSP pressure, but he quickly stated that he was not stepping down.

These incidents have passed without bloodshed, but they have raised fears of an RSP-led coup. For its part, the RSP says (French) that there are no plots, but that it wants Zida and other military officers, such as Minister of Territorial Administration and Security Auguste Barry, to leave the government (French). Both sides accuse the other of seeking to undermine the planned transition. Many observers now look to interim President Michel Kafando to mediate (French) between the parties.

In a recent briefing, the International Crisis Group (ICG) surveyed that landscape and argued in favor of deferring any clear decisions on the RSP’s status until after the elections. Thurston sympathizes with ICG’s view but worries that deferral of those decisions will produce “an atmosphere of impunity.” History says that Thurston is right to worry, but so is ICG. In other words, there are no obvious ways to climb down from the horns of this dilemma.

How Likely Is (Nuclear) War Between the United States and Russia?

Last week, Vox ran a long piece by Max Fisher claiming that “the prospect of a major war, even a nuclear war, in Europe has become thinkable, [experts] warn, even plausible.” Without ever clarifying what “thinkable” or “plausible” mean in this context, Fisher seems to be arguing that, while still unlikely, the probability of a nuclear war between the United States and Russia is no longer small and is rising.

I finished Fisher’s piece and wondered: Is that true? As someone who’s worked on a couple of projects (here and here) that use “wisdom of crowds” methods to make educated guesses about how likely various geopolitical events are, I know that one way to try to answer that question is to ask a bunch of informed people for their best estimates and then average them.

So, on Thursday morning, I went to SurveyMonkey and set up a two-question survey that asks respondents to assess the likelihood of war between the United States and Russia before 2020 and, if war were to happen, the likelihood that one or both sides would use nuclear weapons. To elicit responses, I tweeted the link once and posted it to the Conflict Research Group on Facebook and the IRstudies subreddit. The survey is still running [UPDATE: It’s now closed, because SurveyMonkey won’t show me more than the first 100 responses without a paid subscription], but 100 people have taken it so far, and here are the results—first, on the risk of war:

[Chart: survey responses on the likelihood of war between the United States and Russia before 2020]

And then on the risk that one or both sides would use nuclear weapons, conditional on the occurrence of war:

[Chart: survey responses on the likelihood of nuclear weapons use, conditional on war]

These results come from a convenience sample, so we shouldn’t put too much stock in them. Still, my confidence in their reliability got a boost when I learned yesterday that a recent survey of international-relations experts around the world asked an almost-identical question about the risk of a war and obtained similar results. In its 2014 survey, the TRIP project asked: “How likely is war between the United States and Russia over the next decade? Please use the 0–10 scale with 10 indicating that war will definitely occur.” They got 2,040 valid responses to that question, and here’s how they were distributed:

[Chart: distribution of TRIP survey responses on the likelihood of war between the United States and Russia over the next decade]

Those results are centered a little further to the right than the ones from my survey, but TRIP asked about a longer time period (“next decade” vs. “before 2020”), and those additional five years could explain the difference. It’s also important to note that the scales aren’t directly comparable; where the TRIP survey’s bins implicitly lie on a linear scale, mine were labeled to give respondents more options toward the extremes (e.g., “Certainly not” and “Almost certainly not”).

In light of that corroborating evidence, let’s assume for the moment that the responses to my survey are not junk. So then, how likely is a US/Russia war in the next several years, and how likely is it that such a war would go nuclear if it happened? To get to estimated probabilities of those events, I did two things:

  1. Assuming that the likelihoods implicit in my survey’s labels follow a logistic curve, I converted them to predicted probabilities as follows: p(war) = exp(response – 5)/(1 + exp(response – 5)). That rule produces the following sequence for the 0–10 bins: 0.007, 0.018, 0.047, 0.119, 0.269, 0.500, 0.731, 0.881, 0.953, 0.982, 0.993. (A small code sketch of this conversion follows this list.)

  2. I calculated the unweighted average of those predicted probabilities.
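Here is a small sketch of that two-step conversion, assuming the responses sit in a numeric vector on the 0–10 scale; the example values are made up, not the actual survey responses.

```r
# Hypothetical responses on the 0-10 scale (not the actual survey data)
responses <- c(2, 3, 1, 5, 4, 2, 6, 3)

# Step 1: convert each response to a probability with a logistic curve
# centered on the scale's midpoint (equivalently, plogis(responses - 5))
p <- exp(responses - 5) / (1 + exp(responses - 5))

# Step 2: take the unweighted average as the crowd's estimate
mean(p)
```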

Here are the estimates that process produced, rounded up to the nearest whole percentage point:

  • Probability of war: 11%
  • Probability that one or both sides will use nuclear weapons, conditional on war: 18%

To translate those figures into a single number representing the crowd’s estimate of the probability of nuclear war between the US and Russia before 2020, we take their product: 2%.

Is that number different from what Max Fisher had in mind when he wrote that a nuclear war between the US and Russia is now “thinkable,” “plausible,” and “more likely than you think”? I don’t know. To me, “thinkable” and “plausible” seem about as specific as “possible,” a descriptor that applies to almost any geopolitical event you can imagine. I think Max’s chief concern in writing that piece was to draw attention to a risk that he believes to be dangerously under-appreciated, but it would be nice if he had asked his sources to be more specific about just how likely they think this calamity is.

More important, is that estimate “true”? As Ralph Atkins argued in a recent Financial Times piece about estimating the odds of Grexit, it’s impossible to say. For unprecedented and at least partially unique events like these—an exit from the euro zone, or a nuclear war between major powers—we can never know the event-generating process well enough to estimate their probabilities with high confidence. What we get instead are summaries of people’s current beliefs about those events’ likelihood. That’s highly imperfect, but it’s still informative in its own way.

2015 Tour de France Predictions

I like to ride bikes, I like to watch the pros race their bikes, and I make forecasts for a living, so I thought it would be fun to try to predict the outcome of this year’s Tour de France, which starts this Saturday and ends on July 26. I’m also interested in continuing to explore the predictive power of pairwise wiki surveys, a crowdsourcing tool that I’ve previously used to try to forecast mass-killing onsets, coup attempts, and pro football games, and that ESPN recently used to rank NBA draft prospects.

So, a couple of weeks ago, I used All Our Ideas to create a survey that asks, “Which rider is more likely to win the 2015 Tour de France?” I seeded the survey with the names of 11 riders: the 10 seen by bookmakers at Paddy Power as the most likely winners, plus Peter Sagan, because he’s fun to watch. I posted a link to the survey on Tumblr and trolled for respondents on Twitter and Facebook. The survey got off to a slow start, but then someone posted a link to it in the r/cycling subreddit, and the votes came pouring in. As of this afternoon, the survey had garnered more than 4,000 votes in 181 unique user sessions that came from five continents (see the map below). The crowd also added a handful of other riders to the set under consideration, bringing the list up to 16.

[Map: locations of user sessions in the 2015 Tour de France wiki survey]

So how does that self-selected crowd handicap the race? The dot plot below shows the riders in descending order by their survey scores, which range from 0 to 100 and indicate the probability that that rider would beat a randomly chosen other rider for a randomly chosen respondent. In contrast to Paddy Power, which currently shows Chris Froome as the clear favorite and gives Nairo Quintana a slight edge over Alberto Contador, this survey sees Contador as the most likely winner (survey score of 90), followed closely by Froome (87) and, a little further back, by Quintana (80). Both sources put Vincenzo Nibali fourth (73), with Tejay van Garderen (65) and Thibaut Pinot (51) in the next two spots, although Paddy Power has them in the opposite order. Below that, the distances between riders’ chances get smaller, but the wiki survey’s results still approximate the handicapping of the real-money markets pretty well.

[Chart: riders’ wiki survey scores, in descending order]

There are at least a couple of ways to try to squeeze some meaning out of those scores. One is to read the chart as a predicted finishing order for the 16 riders listed. That’s useful for something like a bike race, where we—well, some of us, anyway—care not only about who wins, but also about where the other riders will finish.

We can also try to convert those scores to predicted probabilities of winning. The chart below shows what happens when we do that by dividing each rider’s score by the sum of all scores and then multiplying the result by 100. The probabilities this produces are all pretty low and more tightly bunched than seems reasonable, but I’m not sure how else to do this conversion. I tried squaring and cubing the scores; the results came closer to what the betting-market odds suggest are the “right” values, but I couldn’t think of a principled reason to do that, so I’m not showing those here. If you know a better way to get from those model scores to well-calibrated win probabilities, please let me know in the comments.
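In code, that conversion is just a normalization of the scores; the values below are illustrative placeholders rather than the full set of survey results.

```r
# Illustrative survey scores keyed by rider (placeholders, not the full results)
scores <- c(Contador = 90, Froome = 87, Quintana = 80, Nibali = 73)

# Divide each score by the sum of all scores and multiply by 100,
# so the implied win probabilities sum to 100 percent
win_prob <- 100 * scores / sum(scores)
round(win_prob, 1)
```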

[Chart: implied win probabilities derived from the wiki survey scores]

So that’s what the survey says. After the Tour concludes in a few weeks, I’ll report back on how the survey’s predictions fared. Meanwhile, here’s wishing the athletes a crash-, injury-, and drug-free Tour. Judging by the other big races I’ve seen so far this year, it should be a great one to watch.

The Birth of Crowdsourcing?

From p. 106 of the first paperback edition of The Professor and the Madman, a slightly overwrought but enjoyable history of the origins of the Oxford English Dictionary, found on the shelf of a vacation rental:

The new venture that [Richard Chenevix] Trench seemed now to be proposing would demonstrate not merely the meaning but the history of meaning, the life story of each word. And that would mean the reading of everything and the quoting of everything that showed anything of the history of the words that were to be cited. The task would be gigantic, monumental, and—according to the conventional thinking of the times—impossible.

Except that here Trench presented an idea, an idea that—to those ranks of conservative and frock-coated men who sat silently in the [London Library] on that dank and foggy evening [in 1857]—was potentially dangerous and revolutionary. But it was the idea that in the end made the whole venture possible.

The undertaking of the scheme, he said, was beyond the ability of any one man. To peruse all of English literature—and to comb the London and New York newspapers and the most literate of the magazines and journals—must be instead “the combined action of many.” It would be necessary to recruit a team—moreover, a huge one—probably comprising hundreds and hundreds of unpaid amateurs, all of them working as volunteers.

The audience murmured with surprise. Such an idea, obvious though it may sound today, had never been put forward before. But then, some members said as the meeting was breaking up, it did have some real merit.

And here’s what that crowdsourcing process ended up looking like in practice:

[Frederick] Furnivall then issued a circular calling for volunteer readers. They could select from which period of history they would like to read books—from 1250 to 1526, the year of the New English Testament; from then to 1674, the year when Milton died; or from 1674 to what was then the present day. Each period, it was felt, represented the existence of different trends in the development of the language.

The volunteers’ duties were simple enough, if onerous. They would write to the society offering their services in reading certain books; they would be asked to read and make word-lists of all that they read, and would then be asked to look, super-specifically, for certain words that currently interested the dictionary team. Each volunteer would take a slip of paper, write at its top left-hand side the target word, and below, also on the left, the date of the details that followed: These were, in order, the title of the book or paper, its volume and page number, and then, below that, the full sentence that illustrated the use of the target word. It was a technique that has been undertaken by lexicographers to the present day.

Herbert Coleridge became the first editor of what was to be called A New English Dictionary on Historical Principles. He undertook as his first task what may seem prosaic in the extreme: the design of a small stack of oak-board pigeonholes, nine holes wide and six high, which could accommodate the anticipated sixty to one hundred thousand slips of paper that would come in from the volunteers. He estimated that the first volume of the dictionary would be available to the world within two years. “And were it not for the dilatoriness of many contributors,” he wrote, clearly in a tetchy mood, “I should not hesitate to name an earlier period.”

Everything about these forecasts was magnificently wrong. In the end more than six million slips of paper came in from the volunteers; and Coleridge’s dreamy estimate that it might take two years to have the first salable section of the dictionary off the presses—for it was to be sold in parts, to help keep revenues coming in—was wrong by a factor of ten. It was this kind of woefully naive underestimate—of work, of time, of money—that at first so hindered the dictionary’s advance. No one had a clue what they were up against: They were marching blindfolded through molasses.

So, even with all those innovations, this undertaking also produced a textbook example of the planning fallacy. I wonder how quickly and cheaply the task could have been completed with Mechanical Turk, or with some brush-clearing assistance from text mining?

A Skeptical Note on Policy-Prescriptive Political Science

My sometimes-colleague Michael Horowitz wrote a great piece for War on the Rocks last week on what “policy relevance” means for political scientists who study international affairs, and the different forms that relevance can take. Among the dimensions of policy relevance he calls out is the idea of “policy actionability”:

Policy actionability refers to a recommendation that is possible to implement for the target of the recommendation. Most academic work is not policy actionable, fundamentally. For example, implications from international relations research are things such as whether countries with high male-to-female ratios are more likely to start military conflicts or that countries that acquire nuclear weapons become harder to coerce.

As Michael notes, most scholarship isn’t “actionable” in this way, and isn’t meant to be. In my experience, though, there is plenty of demand in Washington and elsewhere for policy-actionable research on international affairs, and there is a subset of scholars who, in pursuit of relevance, do try to extract policy prescriptions from their studies.

As an empiricist, I welcome both of those things—in principle. Unfortunately, the recommendations that scholars offer rarely follow directly from their research. Instead, they almost always require some additional, often-heroic assumptions, and those additional assumptions render the whole endeavor deeply problematic. For example, Michael observes that most statistical studies identify average effects—other things being equal, a unit change in x is associated with some amount of change in y—and points out that the effects in any particular case will still be highly uncertain.

That’s true for a lot of what we study, but it’s only the half of it. Even more significant, I think, are the following three assumptions, which implicitly underpin the “policy implications” sections in a lot of the work on international affairs that tries to convert comparative analysis (statistical or not) into policy recommendations:

  • Attempts to induce a change in x in the prescribed direction will actually produce the desired change in x;
  • Attempts to induce a change in x in the prescribed direction will not produce significant and negative unintended consequences; and
  • If it does occur, a change in x induced by the policy actor to whom the scholar is making recommendations will have the same effect on y as previous changes in x that occurred for various other reasons.

The last assumption isn’t so problematic when the study in question looked specifically at policy actions by that same policy actor, but that’s almost never the case in international relations and other fields using observational data to study macro-political behavior. Instead, we’re more likely to have a study that looked at something like GDP growth rates, female literacy, or the density of “civil society” organizations that the policy audience does not control and does not know how to control. Under these circumstances, all three of those assumptions must hold for the research to be neatly “actionable,” and I bet most social scientists will tell you that at least one and probably two or three of them usually don’t.

With so much uncertainty and so much at stake, I wind up thinking that, unless their research designs have carefully addressed these assumptions, scholars—in their roles as scientists, not as citizens or advocates—should avoid that last mile and leave it to the elected officials and bureaucrats hired for that purpose. That’s hard to do when we care about the policies involved and get asked to offer “expert” advice, but “I don’t know” or “That’s not my area of expertise” will almost always be a more honest answer in these situations.

 

One Measure By Which Things Have Recently Gotten Worse

The United Nations’ refugee agency today released its annual report on people displaced by war around the world, and the news is bad:

The number of people forcibly displaced at the end of 2014 had risen to a staggering 59.5 million compared to 51.2 million a year earlier and 37.5 million a decade ago.

The increase represents the biggest leap ever seen in a single year. Moreover, the report said the situation was likely to worsen still further.

The report focuses on raw estimates of displaced persons, but I think it makes more sense to look at this group as a share of world population. The number of people on the planet has increased by more than half a billion in the past decade, so we might expect to see some growth in the number of forcibly displaced persons even if the amount of conflict worldwide had held steady. The chart below plots annual totals from the UNHCR report as a share of mid-year world population, as estimated by the U.S. Census Bureau (here).

[Chart: forcibly displaced persons as a share of world population, 2004–2014]

The number of observations in this time series is too small to use Bayesian change point detection to estimate the likelihood that the upturn after 2012 marks a change in the underlying data-generating process. I’m not sure we need that kind of firepower, though. After holding more or less steady for at least six years, the share of world population forcibly displaced by war has increased by more than 50 percent in just two years, from about one of every 200 people to one of every 133. Equally important, reports from field workers indicate that this problem has only continued to grow in 2015. I don’t think I would call this upturn a “paradigm change,” as UN High Commissioner for Refugees António Guterres did, but there is little doubt that the problem of displacement by war has worsened significantly since 2012.

In historical terms, just how bad is it? Unfortunately, it’s impossible to say for sure. The time series in the UNHCR report only starts in 2004, and a note warns that methodological changes in 2007 render the data before that year incomparable to the more recent estimates. The UNHCR describes the 2014 figure as “the highest level ever recorded,” and that’s technically true but not very informative when recording started only recently. A longer time series assembled by the Center for Systemic Peace (here) supports the claim that the latest raw estimate is the largest ever, but as a share of world population, it’s probably still a bit lower than the levels seen in the post–Cold War tumult of the early 1990s (see here).

Other relevant data affirm the view that, while clearly worsening, the intensity of armed conflict around the world is not at historically high levels, not even for the past few decades. Here is a plot of annual counts of battle-related deaths (low, high, and best estimates) according to the latest edition of UCDP’s data set on that topic (here), which covers the period 1989–2013. Note that these figures have not been adjusted for changes in world population.

Annual estimates of battle-related deaths worldwide, 1989–2013 (data source: UCDP)

We see a similar pattern in the Center for Systemic Peace’s Major Episodes of Political Violence data set (second row here), which covers the whole post-WWII period. For the chart below, I have separately summed the data set’s scalar measure of conflict intensity for two types of conflict, civil and interstate (see the codebook for details). Like the UCDP data, these figures show a local increase in the past few years that nevertheless remains well below the prior peak, which came when the Soviet Union fell apart.

Annual intensity of political violence worldwide, 1946–2014 (data source: CSP)

And, for longer-term perspective, it always helps to take another look at this one, from an earlier UCDP report:

[Chart: long-run trend in battle deaths worldwide, from an earlier UCDP/PRIO report]

I’ll wrap this up by pinning a note in something I see when comparing the shorter-term UCDP estimates to the UNHCR estimates on forcibly displaced persons: adjusting for population, it looks like armed conflicts may be killing fewer people but displacing more of them than they used to. That impression is bolstered by a glance at UCDP data on trends in deaths from “intentional attacks on civilians by governments and formally organized armed groups,” which UCDP calls “one-sided violence” (here). As the plot below shows, the recent upsurge in warfare has not yet produced a large increase in the incidence of these killings, either. The line is bending upward, but it remains close to historical lows.

Estimated annual deaths from one-sided violence, 1989–2013 (Source: UCDP)

So, in the tumult of the past few years, it looks like the rate of population displacement has surged while the rate of battle deaths has risen more slowly and the rate of one-sided violence targeting civilians hasn’t risen much at all. If that’s true, then why? Improvements in medical care in conflict zones are probably part of the story, but I wonder if changes in norms and values, and in the international institutions and practices instantiating them, aren’t also shaping these trends. Governments that in the past might have wantonly killed populations they regarded as threats now seem more inclined to press those populations by other means—not always, but more often. Meanwhile, international organizations are readier than ever to assist those groups under pressure by feeding and sheltering them, drawing attention to their miseries, and sometimes even protecting them. The trend may be fragile, and the causality is impossible to untangle with confidence, but it deserves contemplation.

From China, Another Strike Against Legitimacy

I’ve groused on this blog before (here and here) about the trouble with “legitimacy” as a causal mechanism in theories of political stability and change, and I’ve pointed to Xavier Marquez’s now-published paper as the most cogent expression of this contrarian view to date.

Well, here is a fresh piece of empirical evidence against the utility of this concept: according to a new Global Working Paper from Brookings, the citizens of China who have benefited the most from that country’s remarkable economic growth in recent decades are, on average, its least happy. As one of the paper’s authors describes in a blog post about their research,

We find that the standard determinants of well-being are the same for China as they are for most countries around the world. At the same time, China stands out in that unhappiness and reported mental health problems are highest among the cohorts who either have or are positioned to benefit from the transition and related growth—a clear progress paradox. These are urban residents, the more educated, those who work in the private sector, and those who report to have insufficient leisure time and rest.

These survey results contradict the “performance legitimacy” story that many observers use to explain how the Chinese Communist Party has managed to avoid significant revolutionary threats since 1989 (see here, for example). In that story, Chinese citizens choose not to demand political liberalization because they are satisfied with the government’s economic performance. In effect, they accept material gains in lieu of political voice.

Now, though, we learn that the cohorts in which contentious collective action is most likely to emerge—educated urbanites—are also, on average, the country’s least happy people. The authors also report (p. 14) that, in China, “the effect of income increases on life satisfaction are limited.” A legitimacy-based theory predicts that the CCP is surviving because it is making and keeping its citizens happy; instead, we see that it is surviving in spite of deepening unhappiness among key cohorts.

To me, this case further bares the specious logic behind most legitimacy-based explanations for political continuity. We believe that rebellion is an expression of popular dissatisfaction, a kind of referendum in the streets; we observe stability; so, we reason backwards from the absence of rebellion to the absence of dissatisfaction, sprinkle a little normative dust on it, and arrive at a positive concept called legitimacy. Formally, this is a fallacy of affirmative conclusion from a negative premise: happy citizens don’t rebel, no rebellion is occurring, therefore citizens must be happy. Informally, I think it’s a qualitative version of the “story time” process in which statistical modelers often indulge: get a surprising result, then make up a richer explanation for it that feels right.

I don’t mean to suggest that popular attitudes are irrelevant to political stasis and change, or that the durability of specific political regimes has nothing to do with the affinity between their institutional forms and the cultural contexts in which they’re operating. Like Xavier, though, I do believe that the conventional concept of legitimacy is too big and fuzzy to have any real explanatory power, and I think this new evidence from China reminds us of that point. If we want to understand how political regimes persist and when they break down, we need to identify mechanisms that are more specific than this one, and to embed them in theories that allow for more complexity.

A Plea for More Prediction

The second Annual Bank Conference on Africa happened in Berkeley, CA, earlier this week, and the World Bank’s Development Impact blog has an outstanding summary of the 50-odd papers presented there. If you have to pick between reading this post and that one, go there.

One paper on that roster that caught my eye revisits the choice of statistical models for the study of civil wars. As authors John Paul Dunne and Nan Tian describe, the default choice is logistic regression, although probit gets a little playing time, too. They argue, however, that a zero-inflated Poisson (ZIP) model matches the data-generating process better than either of these traditional picks, and they show that this choice affects what we learn about the causes of civil conflict.

Having worked on statistical models of civil conflict for nearly 20 years, I have some opinions on that model-choice issue, but those aren’t what I want to discuss right now. Instead, I want to wonder aloud why more researchers don’t use prediction as the yardstick—or at least one of the yardsticks—for adjudicating these model comparisons.

In their paper, Dunne and Tian stake their claim about the superiority of ZIP to logit and probit on comparisons of Akaike information criteria (AIC) and Vuong tests. Okay, but if their goal is to see if ZIP fits the underlying data-generating process better than those other choices, what better way to find out than by comparing out-of-sample predictive power?

Prediction is fundamental to the accumulation of scientific knowledge. The better we understand why and how something happens, the more accurate our predictions of it should be. When we estimate models from observational data and only look at how well our models fit the data from which they were estimated, we learn some things about the structure of that data set, but we don’t learn how well those things generalize to other relevant data sets. If we believe that the world isn’t deterministic—that the observed data are just one of many possible realizations of the world—then we need to care about that ability to generalize, because that generalization and the discovery of its current limits is the heart of the scientific enterprise.

From a scientific standpoint, the ideal world would be one in which we could estimate models representing rival theories, then compare the accuracy of the predictions they generate across a large number of relevant “trials” as they unfold in real time. That’s difficult for scholars studying big but rare events like civil wars and wars between states, though: a lot of time has to pass before we’ll see enough new examples to make a statistically powerful comparison across models.

But, hey, there’s an app for that—cross-validation! Instead of using all the data in the initial estimation, hold some out to use as a test set for the models we get from the rest. Better yet, split the data into several equally-sized folds and then iterate the training and testing across all possible groupings of them (k-fold cross-validation). Even better, repeat that process a bunch of times and compare distributions of the resulting statistics.
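For readers who want a concrete starting point, here is a minimal k-fold cross-validation sketch for a binary outcome like civil-war onset, using a plain logistic regression; the data frame and variable names (dat, onset, x1, x2) are hypothetical placeholders, and out-of-sample log-loss stands in for whichever accuracy measure you prefer.

```r
# Minimal k-fold cross-validation sketch for a binary outcome.
# 'dat', 'onset', 'x1', and 'x2' are hypothetical placeholders.
set.seed(42)
k <- 5
folds <- sample(rep(1:k, length.out = nrow(dat)))

logloss <- rep(NA, k)
for (i in 1:k) {
  train <- dat[folds != i, ]
  test  <- dat[folds == i, ]
  fit   <- glm(onset ~ x1 + x2, data = train, family = binomial)
  p     <- predict(fit, newdata = test, type = "response")
  logloss[i] <- -mean(test$onset * log(p) + (1 - test$onset) * log(1 - p))
}

mean(logloss)  # average out-of-sample log-loss; repeat and compare across models
```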

Prediction is the gold standard in most scientific fields, and cross-validation is standard practice in many areas of applied forecasting, because they are more informative than in-sample tests. For some reason, political science still mostly eschews both.* Here’s hoping that changes soon.

* For some recent exceptions to this rule on topics in world politics, see Ward, Greenhill, and Bakke and Blair, Blattman, and Hartman on predicting civil conflict; Chadefaux on warning signs of interstate war; Hill and Jones on state repression; and Chenoweth and me on the onset of nonviolent campaigns.
