One of the most important and influential research programs in comparative politics in my professional lifetime depends on data that are, in my view, far too flimsy to support the inferential edifices we keep trying to build with them.
I’m talking about research on the relationship between economic inequality and democracy. This topic is hardly new–Karl Marx had some important things to say about it in the mid-1800s–but interest in the subject was renewed in the early 2000s with the publication of books by comparativist Carles Boix (2003) and economists Daron Acemoglu and James Robinson (2006). Drawing intellectual inspiration from Marxist political sociology, both books cast politics as, at its roots, a struggle between rich and poor over the distribution (or redistribution) of wealth. The poor want more of it, but they have a hard time getting and staying organized enough to take it from the rich, who can usually use their wealth and power to dispel or repel any challenges. When the poor finally do manage to organize a credible and formidable threat, however, wealthy elites may offer democratic government as a form of compromise, allowing them to concede the redistribution of some wealth without having their assets seized or suffering the costs of a long fight.
Boix and Acemoglu & Robinson identify several factors that contribute to the relevant actors’ strategies, but the one around which a major research program has emerged is economic inequality. According to Boix, democratic transitions are most likely to occur when inequality is low. In Acemoglu & Robinson’s model, the relationship is curvilinear: democratic transitions are most likely at middling levels of inequality and less likely when inequality is either very low or very high. Whichever model we use, though, the implication is that democracy emerges as a strategic concession to pressures on the haves from the have-nots under conditions that are specific enough to test, provided we have the requisite data.
These authors’ theoretical models are explicitly intended to explain hundreds of years’ worth of institutional stability and change in all parts of the world, and their work has inspired many new and interesting research projects in comparative politics. When I started attending academic conferences in the mid-2000s, this topic seemed to be gulping down most of the intellectual oxygen in the field of comparative democratization. Whole panels were devoted to the topic, usually more than one per conference, and I was often told that my statistical analyses which excluded inequality (see here and here for examples) were incomplete. Some of the projects spawned by this burst of activity have produced articles that have appeared in the discipline’s most influential journals, including one in the most recent issue of the American Political Science Review.
Here’s the problem, though: Democratic transitions are rare events. So, to test the broad historical claims these authors make, we need reliable measures of economic inequality from a large number of countries for long periods of time. Coarse measures would suffice if the relevant theories were only concerned with gross and static variations in inequality, but they’re not. These theories are meant to be dynamic, and they posit that modest differences or changes in the degree of inequality can have significant effects.
The measures of economic inequality we actually have, however, are nowhere near that good. To accurately measure economic inequality, we need to observe variation in assets, income, or consumption at the individual or household level. (See this paper for a careful discussion of different ways to measure inequality.) That kind of observation can only happen through well-designed surveys or carefully kept tax records. Everything else is guesstimation, often with very wide confidence intervals. Of course, household-level surveys rarely happen in poor countries, and they hardly happened anywhere until fairly recently in human history. Poor countries also tend to have poor tax records, and even the records in wealthy countries are sometimes suspect. We also know that some dictatorships simply don’t share this kind of data with the outside world–Cuba and North Korea are still black holes in major cross-national economic data sets–and when they do, the validity of the reported values is often suspect.
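To see why household-level observation matters so much, here is a minimal sketch in Python. It uses the Gini coefficient, which I pick only because it is a common summary measure; the argument applies to any measure. The function name `gini` and the income figures are made up for illustration and come from no real survey.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (0 = perfect equality)."""
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # where i = 1..n is each household's rank in ascending order of income.
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical household incomes (illustrative numbers only).
households = [2, 3, 4, 6, 10, 75]

# Same society, but the richest household never makes it into the survey
# or the tax records.
households_missing_top = households[:-1]

full_estimate = gini(households)
truncated_estimate = gini(households_missing_top)

print(f"Gini, all households:      {full_estimate:.3f}")
print(f"Gini, richest one missing: {truncated_estimate:.3f}")
```

In this toy example, losing a single household at the top of the distribution cuts the estimate substantially, which is the kind of error that coarse or incomplete records can introduce without anyone noticing.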
These problems are all clearly reflected in the gaps and confidence measures in the leading source of data on this topic, the World Bank’s Measuring Income Inequality Database (a.k.a. Deininger & Squire). Browsing the data in country-year format, it’s easy to see that many countries (e.g., Afghanistan) have few or no observations; countries generally come online as they get richer (e.g., Latin America in the latter half of the 20th century); and where poor countries are included, the data are often marked as unreliable. In one paper on the topic, Christian Houle notes that the Deininger & Squire dataset includes observations for just 10% of all country-years during the period 1950-2001. Ten percent! And that’s just for the most recent half-century. Other scholars have attempted to improve on those data–see here for one prominent effort–but no alchemy can spin reliable measures from thin air.
In short, there’s a systematic relationship between the existence and quality of our observations of inequality and the very outcomes we’re trying to explain. For statistical analysis that’s meant to generate causal inferences, this is the worst kind of problem to have.
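To make the mechanics of that problem concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption for illustration: the function names `simulate` and `gap`, the reporting probabilities, and the rule that autocracies mostly report inequality figures only when those figures are low. None of the numbers come from Deininger & Squire or any other real data set. By construction, inequality has no relationship to regime type in the simulated world, so any gap that shows up in the observed subset is produced by the missingness alone.

```python
import random
from statistics import mean


def simulate(n=20000, seed=1):
    """Simulate country-years as (inequality, democracy, observed) tuples."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        inequality = rng.uniform(0.2, 0.7)   # Gini-like score
        democracy = rng.random() < 0.5       # independent of inequality
        if democracy:
            p_observed = 0.8                 # assumed: democracies usually report
        else:
            # Assumed: autocracies rarely report when inequality is high.
            p_observed = 0.6 if inequality < 0.45 else 0.1
        observed = rng.random() < p_observed
        rows.append((inequality, democracy, observed))
    return rows


def gap(rows):
    """Mean inequality in democracies minus mean inequality in autocracies."""
    dem = [x for x, is_dem, _ in rows if is_dem]
    aut = [x for x, is_dem, _ in rows if not is_dem]
    return mean(dem) - mean(aut)


rows = simulate()
observed_rows = [r for r in rows if r[2]]

full_gap = gap(rows)
observed_gap = gap(observed_rows)
coverage = len(observed_rows) / len(rows)

print(f"coverage of country-years: {coverage:.1%}")
print(f"gap in the full (unobservable) data: {full_gap:+.3f}")
print(f"gap in the data we actually get:     {observed_gap:+.3f}")
```

With these made-up parameters, the full-sample gap should sit near zero while the observed-sample gap comes out clearly positive, so an analyst working only with the reported data would find an association between inequality and regime type that does not exist in the simulated world. Real missingness is messier than this, but the direction of the problem is the same.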
Given that problem, it’s hard for me to understand how the field of comparative politics has come to take the results of these studies so seriously. If we want to stick to cases where we have reliable measures of inequality, we have to limit our analysis to recent decades in richer countries, where there’s little or no variation on the dependent variable. What we can’t and never will be able to do with confidence–because no one can go back in time or reconstruct surveys or records that never existed–is a global analysis of the relationship between income inequality and democratization in the 19th and 20th centuries. Maybe the requisite data will become available to study this relationship in poorer societies of the future, but the past is mostly lost to us.
This hasn’t stopped many from trying, but the flimsy data on which those studies are usually based make me wonder how we’ve come to consider the results to be much more than intriguing curiosities. I understand and agree that this is a really interesting and important question. One of the frustrating things about being a social scientist, though, is that there are often important questions to which we simply can’t provide clear answers. I believe this is one of those questions, and I hope this post has convinced a few of you of that, too.