Here’s a paragraph, from a 2011 paper by Ian Lustick, that I really wish I’d written. It’s long, yes, but it rewards careful reading.
One might naively imagine that Darwin’s theory of the “origin of species” to be “only” about animals and plants, not human affairs, and therefore presume its irrelevance for politics. But what are species? The reason Darwin’s classic is entitled Origin of Species and not Origin of the Species is because his argument contradicted the essentialist belief that a specific, finite, and unchanging set of categories of kinds had been primordially established. Instead, the theory contends, “species” are analytic categories invented by observers to correspond with stabilized patterns of exhibited characteristics. They are no different in ontological status than “varieties” within them, which are always candidates for being reclassified as species. These categories are, in essence, institutionalized ways of imagining the world. They are institutionalizations of difference that, although neither primordial nor permanent, exert influence on the futures the world can take—both the world of science and the world science seeks to understand. In other words, “species” are “institutions”: crystallized boundaries among “kinds”, constructed as boundaries that interrupt fields of vast and complex patterns of variation. These institutionalized distinctions then operate with consequences beyond the arbitrariness of their location and history to shape, via rules (constraints on interactions), prospects for future kinds of change.
This is one of the big ideas to which I was trying to allude in a post I wrote a couple of months ago on “complexity politics”, and in an ensuing post that used animated heat maps to trace gross variations in forms of government over the past 211 years. Political regime types are the species of comparative politics. They are “analytic categories invented by observers to correspond with stabilized patterns of exhibited characteristics.” In short, they are institutionalized ways of thinking about political institutions. The patterns they describe may be real, but they are not essential. They’re not the natural contours of the moon’s surface; they’re the faces we sometimes see in them.
If we could just twist our mental kaleidoscopes a bit, we might find different things in the same landscape. One way to do that would be to use a different set of measures. For the past 20 years or so, political scientists have relied almost exclusively on the same two data sets—Polity and Freedom House’s Freedom in the World—to describe and compare national political regimes in anything other than prose. These data sets are very useful, but they are also profoundly conventional. Polity offers a bit more detail than Freedom House on specific features of national politics, but the two are essentially operationalizing the same assumptions about the underlying taxonomy of forms of government.
Given that fact, it’s hard to see how further distillations of those data sets might surprise us in any deep way. A new project called Varieties of Democracy (V-Dem) promises to bring fresh grist to the mill by greatly expanding the number of institutional elements we can track, but it is still inherently orthodox. Its creators aren’t trying to reinvent the taxonomy; they’re looking to do a better job locating individuals in the prevailing one. That’s a worthy and important endeavor, but it’s not going to produce the kind of gestalt shift I’m talking about here.
New methods of automated text analysis just might. My knowledge of this field is quite limited, but I’m intrigued by the possibilities of applying unsupervised learning techniques, such as latent Dirichlet allocation (LDA), to the problem of identifying political forms and associating specific cases with them. In contrast to conventional measurement strategies, LDA doesn’t oblige us to specify a taxonomy ahead of time and then look for instances of the things in it. Instead, LDA treats each document as a mixture of a fixed number of overlapping latent categories (we choose how many to look for, but not what they are), and these latent categories are partially revealed by characteristic patterns in the ways we talk and write about the world.
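For readers who want to see the mechanics, here is a minimal sketch of LDA fitted with a collapsed Gibbs sampler, using only the Python standard library. Everything in it is an illustrative assumption rather than a description of any real project: the six toy "documents" are invented word lists, the function name `fit_lda` is hypothetical, and the number of topics, the priors `alpha` and `beta`, and the iteration count are arbitrary choices. Real work would use a tested library and a far larger corpus.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA. `docs` is a list of token lists."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start by assigning every token to a random topic.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove this token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed estimate of each document's mixture over topics.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    # Most frequent words currently assigned to each topic.
    top_words = [
        [w for w, c in tw.most_common(3) if c > 0] for tw in topic_word
    ]
    return theta, top_words


# Invented toy corpus: each "document" is a bag of words about one country-year.
docs = [
    "election parliament vote party coalition vote".split(),
    "party election campaign vote parliament".split(),
    "junta coup general decree curfew".split(),
    "coup general junta arrest decree decree".split(),
    "election vote coup general party".split(),
    "parliament coalition decree party junta".split(),
]

theta, top_words = fit_lda(docs, n_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
for d, row in enumerate(theta):
    print(f"doc {d}: {[round(p, 2) for p in row]}")
```

Each row of `theta` gives one document's estimated share of each latent category, so a case can sit partly in one category and partly in another instead of being forced into a single box. Note that the sampler only discovers what the categories contain; the number of categories is still an input we have to choose.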
Unsupervised learning is still constrained by the documents we choose to include and the language used in them, but it should help us find patterns in the practice of politics that our conventional taxonomies overlook. I hope to get some funding to try this approach in the near future, and if that happens, I’m genuinely excited to see what we find.