Measurement Is Hard, Especially of Politics, and Everything Is Political

If occasional readers of this blog remember only one thing from their time here, I’d like it to be this: we may be getting better at measuring political things around the world, but huge gaps remain, sometimes on matters that seem basic or easy to see, and we will never close those gaps completely.

Two items this week reminded me of this point. The first came from the World Bank, which blogged that only about half of the countries they studied for a recent paper had “adequate” data on poverty. As a chart from an earlier World Bank blog post showed, the number of countries suffering from “data deprivation” on this topic has declined since the early 1990s, but it’s still quite large. Also notice that the period covered by the 2015 study ends in 2011. So, in addition to “everywhere”, we’ve still got serious problems with the “all the time” part of the Big Data promise, too.

The other thing that reminded me of data gaps was a post on the Lowy Institute’s Interpreter blog about Myanmar’s military, the Tatmadaw. According to Andrew Selth,

Despite its dominance of Burma’s national affairs for decades, the Tatmadaw remains in many respects a closed book. Even the most basic data is beyond the reach of analysts and other observers. For example, the Tatmadaw’s current size is a mystery, although most estimates range between 300,000 and 350,000. Official statistics put Burma’s defence expenditure this year at 3.7% of GDP, but the actual level is unknown.

This kind of situation may be especially pernicious. It looks like we have data—350,000 troops, 3.7 percent of GDP—but the subject-matter expert knows that those data are not reliable. For those of us trying to do cross-national analysis of things like conflict dynamics or coup risk, the temptation to plow ahead with the numbers we have is strong, but we shouldn’t trust the inferences we draw from them.

The size and capability of a country’s military are obviously political matters. It’s not hard to imagine why governments might want to mislead others about the true values of those statistics.

Measuring poverty might seem less political and thus more amenable to technical fixes or workarounds, but that really isn’t true. At each step in the measurement process, the people being observed or doing the observing may have reasons to obscure or mislead. Survey respondents might not trust their observers; they may fear the personal or social consequences of answering or not answering certain ways, or just not like the intrusion. When the collection is automated, they may develop ways to fool the routines. Local officials who sometimes oversee the collection of those data may be tempted to fudge numbers that affect their prospects for promotion or permanent exile. National governments might seek to mislead other governments as a way to make their countries look stronger or weaker than they really are—stronger to deter domestic and international adversaries or get a leg up in ideological competitions, or weaker to attract aid or other help.

As social scientists, we dream of data sets that reliably track all sorts of human behavior. Our training should also make us sensitive to the many reasons why that dream is impossible and, in many cases, undesirable. Measurement begets knowledge; knowledge begets power; and struggles over power will never end.

A Research Note on Updating Coup Forecasts

A new year is about to start, and that means it’s time for me to update my coup forecasts (see here and here for the 2013 and 2012 editions, respectively). The forecasts themselves aren’t quite ready yet—I need to wait until mid-January for updates from Freedom House to arrive—but I am making some changes to my forecasting process that I thought I would go ahead and describe now, because the thinking behind them illustrates some important dilemmas and new opportunities for predictions of many kinds of political events.

When it comes time to build a predictive statistical model of some rare political event, it’s usually not the model specification that gives me headaches. For many events of interest, I think we now have a pretty good understanding of which methods and variables are likely to produce more accurate forecasts.

Instead, it’s the data, or really the lack thereof, that sets me to pulling my hair out. As I discussed in a recent post, things we’d like to include in our models fall into a few general classes in this regard:

  • No data exist (fuggeddaboudit)
  • Data exist for some historical period, but they aren’t updated (“HA-ha!”)
  • Data exist and are updated, but they are patchy and not missing at random (so long, some countries)
  • Data exist and are updated, but not until many months or even years later (Spinning Pinwheel of Death)

In the past, I’ve set aside measures that fall into the first three of those sets but gone ahead and used some from the fourth, if I thought the feature was important enough. To generate forecasts before the original sources updated, I either a) pulled forward the last observed value for each case (if the measure was slow-changing, like a country’s infant mortality rate) or b) hand-coded my own updates (if the measure was liable to change from year to year, like a country’s political regime type).
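Approach (a), carrying the last observed value forward, can be sketched in a few lines of pandas. The panel and column names here are made up for illustration; the point is just the per-country forward fill:

```python
import pandas as pd

# Hypothetical country-year panel with a slow-changing feature
# (infant mortality) whose source lags a year or two behind.
panel = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "B"],
    "year":    [2011, 2012, 2013, 2011, 2012, 2013],
    "infant_mortality": [45.0, 44.1, None, 12.3, None, None],
})

# Carry the last observed value forward within each country,
# so the most recent years aren't dropped for missingness.
panel["infant_mortality"] = (
    panel.sort_values("year")
         .groupby("country")["infant_mortality"]
         .ffill()
)
```

Note that the fill happens within each country’s series, so one country’s stale value never bleeds into another’s.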

Now, though, I’ve decided to get out of the “artisanal updating” business, too, for all but the most obvious and uncontroversial things, like which countries recently joined the WTO or held national elections. I’m quitting this business, in part, because it takes a lot of time and the results can be pretty noisy. More important, though, I’m also quitting because it’s not so necessary any more, thanks to timelier updates from some data providers and the arrival of some valuable new data sets.

This commitment to more efficient updating has led me to adopt the following rules of thumb for my 2014 forecasting work:

  • For structural features that don’t change much from year to year (e.g., population size or infant mortality), include the feature and use the last observed value.
  • For variables that can change from year to year in hard-to-predict ways, only include them if the data source is updated in near-real time or, if it’s updated annually, if those updates are delivered within the first few weeks of the new year.
  • In all cases, only use data that are publicly available, to facilitate replication and to encourage more data sharing.

And here are some of the results of applying those rules of thumb to the list of features I’d like to include in my coup forecasting models for 2014.

  • Use Powell and Thyne’s list of coup events instead of Monty Marshall’s. Powell and Thyne’s list is updated throughout the year as events occur, whereas the publicly available version of Marshall’s list is only updated annually, several months after the start of the year. That wouldn’t matter so much if coups were only the dependent variable, but recent coup activity is also an important predictor, so I need the last year’s updates ASAP.
  • Use Freedom House’s Freedom in the World (FIW) data instead of Polity IV to measure countries’ political regime type. Polity IV offers more granular measures of political regime type than Freedom in the World, but Polity updates aren’t posted until spring or summer of the following year, usually more than a third of the way into my annual forecasting window.
  • Use IMF data on economic growth instead of the World Bank’s. The Bank now updates its World Development Indicators a couple of times a year, and there’s a great R package that makes it easy to download the bits you need. That’s wonderful for slow-changing structural features, but it still doesn’t get me data on economic performance as fast as I’d like it. I work around that problem by using the IMF’s World Economic Outlook Database, which includes projections for years for which observed data aren’t yet available and forecasts for several years into the future.
  • Last but not least, use GDELT instead of UCDP/PRIO or Major Episodes of Political Violence (MEPV) to measure civil conflict. Knowing which countries have had civil unrest or violence in the recent past can help us predict coup attempts, but the major publicly available measures of these things are only updated well into the year. GDELT now represents a nice alternative. It covers the whole world, measures lots of different forms of political cooperation and conflict, and is updated daily, so country-year updates are available on January 2. GDELT’s period of observation starts in 1979, so it’s still a stretch to use it in models of super-rare events like mass-killing onsets, where the number of available examples since 1979 on which to train is still relatively small. For less-rare events like coup attempts, though, starting the analysis around 1980 is no problem. (Just don’t forget to normalize the event counts!) With some help from John Beieler, I’m already experimenting with adding annual GDELT summaries to my coup forecasting process, and I’m finding that they do improve the model’s out-of-sample predictive power.
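On that normalization point: raw GDELT counts grow over time as the system’s source coverage expands, so a common fix is to express event tallies as a share of all events recorded for that country-year. A minimal sketch with made-up numbers (the column names are hypothetical, not GDELT field names):

```python
import pandas as pd

# Hypothetical annual tallies: counts of conflict events and of
# all events recorded for each country-year.
tallies = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year":    [2012, 2013, 2012, 2013],
    "conflict_events": [30, 90, 5, 5],
    "total_events":    [300, 1200, 100, 50],
})

# Normalize: conflict events as a share of all recorded events,
# so growth in source coverage doesn't masquerade as growing unrest.
tallies["conflict_share"] = (
    tallies["conflict_events"] / tallies["total_events"]
)
```

In this toy example, country A’s raw conflict count triples from 2012 to 2013, but its *share* of recorded events actually falls, which is the distinction the normalization is meant to capture.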

In all of the forecasting work I do, my long-term goals are 1) to make the forecasts more dynamic by updating them more frequently (e.g., monthly, weekly, or even daily instead of yearly) and 2) to automate that updating process as much as possible. The changes I’m making to my coup forecasting process for 2014 don’t directly accomplish either of these things, but they do take me a few steps in both directions. For example, once GDELT is in the mix, it’s possible to start thinking about how to switch to monthly or even daily updates that rely on a sliding window of recent GDELT tallies. And once I’ve got a coup data set that updates in near-real time, I can imagine pinging that source each day to update the counts of coup attempts in the past several years. I’m still not where I’d like to be, but I think I’m finally stepping onto a path that can carry me there.
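The sliding-window idea mentioned above is mechanically simple once the data update daily. Here is a rough pandas sketch, using a stand-in series of one event per day rather than real GDELT tallies:

```python
import pandas as pd

# Stand-in daily event counts for one country; real inputs would
# be daily GDELT tallies.
days = pd.date_range("2013-12-01", periods=60, freq="D")
daily = pd.Series(1, index=days)

# A 30-day sliding window of recent activity: recompute this each
# morning and the conflict measure is never more than a day stale.
rolling_30d = daily.rolling(window=30).sum()
```

The first 29 entries are undefined (the window isn’t full yet); after that, each day’s value summarizes the preceding month of activity.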

There Are Two Kinds of Countries in the World: _____ and _____

A few days ago, Sean Langberg blogged about a subject that’s long been a pet peeve of mine: how we classify countries when we try to talk about the international system, and the labels we apply to the resulting groups. I thought I’d take the cue to air my grievances on the topic and make a couple of simple suggestions.

Taxonomies require organizing principles, and the kernel of the classification system Americans usually use in international politics comes from modernization theory. Modernization theory’s core idea is the teleological one that economic growth, urbanization, industrialization, and political democracy are the natural, desirable, and mutually reinforcing ends of social change, or “development” for short. Viewed through this lens, some wealthy, democratic countries appear to have arrived already, while the rest are playing catch-up. In other words, the former have “developed,” while the latter are still “developing.”

This conventional approach is plainly displayed in the International Monetary Fund’s (IMF) semi-annual World Economic Outlook reports, which sort countries into two bins: “advanced” and “emerging and developing.” The former includes the U.S., Canada, Europe, Australia and New Zealand, and a smattering of richer Asian countries, while the latter is, simply, everyone else. What, exactly, distinguishes these two groups is left unspecified–according to the April 2012 report, “This classification is not based on strict criteria, economic or otherwise, and it has evolved over time”–but the basic divide is the familiar one between the “West” and “the rest.” The First World vs. Third World tags have largely faded from use since the Second World disappeared in the early 1990s, but the underlying concept is the same.

What’s so distasteful about the conventional approach is its connotations of hierarchy and even moral superiority. A couple dozen countries, mostly “white” and European, are described as having reached the desired end state, while the rest of the world struggles and strains to catch up. The rich and powerful have matured; a few fortunate others are just now emerging from backwardness; and the rest remain retarded in their development.

There are other ways to do this. Back when Marxism was still alive and kicking, some social scientists used it to divide the world into a “center” and a “periphery” defined by the economic exploitation and political subjugation of the latter by the former. Dubbed dependency theory, this scheme died a bitter death for empirical, political, and sociological reasons. Empirically, dependency theory couldn’t really explain how some once-peripheral countries eventually got much richer in spite of their supposed subjugation. Politically, the import-substitution policies dependency theorists prescribed were a bust. Sociologically, dependency theory got tagged (with justification) as part of a wider leftist political project, so it was further deflated by the ideological and practical collapse of Communism in the late 1980s. All of that said, dependency theory did present a reasoned alternative to the neoliberal scheme it opposed, and, in so doing, it spotlighted some important realities of the international system.

Some have tried to classify countries along religious or cultural lines, but I think these attempts have generally been less successful. The most prominent expression of this approach in the U.S. comes from Samuel Huntington’s “clash of civilizations” writings, in which he argued that the fundamental sources of conflict between states in the post-Cold War world would be cultural rather than ideological or economic. This thesis seems to find some echoes in the Global War on Terror, but critics have rightfully taken Huntington to task for reducing the fantastic diversity and rapidly-evolving cultural constellations of so many countries to a single, simple identity defined primarily by their dominant religions.

More generally, I wonder if the distinction between sacred and secular means that states aren’t the relevant units for global taxonomies based on religion. Perhaps clans, families, or souls would be more fitting. Ongoing attempts by some Muslims to establish a caliphate imply that it is at least theoretically possible to sort international political units into insider and outsider groups based on religious practice, but the fact that these groupings generally contain one or zero countries should tell us something about their disutility.

For comparing countries, wealth seems like a perfectly good yardstick, in no small part because national wealth is so tightly linked to the forms of power that drive contemporary international relations. But then why not talk about money instead of this fuzzier idea of development? This is what the World Bank does nowadays, and its low-income, middle-income, and high-income designations–based strictly on gross national income (GNI) per capita–would seem to offer more analytical leverage than the IMF’s “advanced” vs. “emerging and developing” distinction without all the ugly baggage. The Economist takes this approach, too, and seems no worse for it.
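The appeal of the Bank’s scheme is that it reduces to a transparent lookup on a single number. A minimal sketch (the dollar thresholds here are rough illustrations; the Bank revises the official cutoffs annually, so don’t treat these as the real figures):

```python
def income_group(gni_per_capita):
    """Classify a country by GNI per capita in current US dollars.

    Thresholds are illustrative approximations of the World Bank's
    scheme, which is revised each year.
    """
    if gni_per_capita <= 1035:
        return "low-income"
    elif gni_per_capita <= 12615:
        return "middle-income"
    else:
        return "high-income"
```

Whatever the exact cutoffs, the point stands: the criterion is explicit and checkable, which is more than can be said for “developed” vs. “developing.”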

For people concerned about the broader package of liberal constructs–the values and institutional forms that most authors probably have in mind when they refer to the “West”–why not make those criteria explicit and be more transparent about how they are measured? Observers who are primarily interested in domestic politics might consider the organization of a country’s political economy to compare it with others. This could be done by considering procedures to select national leaders on the one hand and prevailing sources of wealth generation on the other. Meanwhile, people who are more interested in the organization of the international system could look explicitly at formal and informal entanglements among states to identify relevant communities in a way that escapes the tired and broken bifurcations of East vs. West and North vs. South.

Whatever your preferred solution, I beg you, please, stop, stop, STOP referring to countries as “developed” and “developing.” And if you find that you must, at least put those awful labels in quotes.
