Google has created an automatic translation tool that is unlike all others. It is not based on the intellectual presuppositions of early machine translation efforts – it isn’t an algorithm designed only to extract the meaning of an expression from its syntax and vocabulary. In fact, at bottom, it doesn’t deal with meaning at all. Instead of taking a linguistic expression as something that requires decoding, Google Translate (GT) takes it as something that has probably been said before. It uses vast computing power to scour the internet in the blink of an eye, looking for the expression in some text that exists alongside its paired translation. Drawing on the already established patterns of matches between these millions of paired documents, GT uses statistical methods to pick out the most probable acceptable version of what’s been submitted to it. Much of the time, it works.
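To make the idea in that passage concrete, here is a minimal sketch of "pick the most probable version from paired examples." It is a toy illustration only, not Google's actual method: the tiny corpus, the `build_phrase_table` and `most_probable_translation` names, and the whole-phrase lookup are all assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (source phrase, observed translation) pairs.
# A real system would draw on millions of paired documents.
PAIRED_EXAMPLES = [
    ("bonjour", "hello"),
    ("bonjour", "hello"),
    ("bonjour", "good morning"),
    ("merci", "thank you"),
    ("merci", "thanks"),
    ("merci", "thank you"),
]


def build_phrase_table(pairs):
    """Count how often each translation has been paired with each source phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def most_probable_translation(table, source):
    """Return (translation, relative frequency) for the most common pairing.

    Returns None if the phrase has never been seen, which is the case
    where a pure lookup approach has nothing to go on.
    """
    counts = table.get(source)
    if not counts:
        return None
    target, count = counts.most_common(1)[0]
    return target, count / sum(counts.values())


if __name__ == "__main__":
    table = build_phrase_table(PAIRED_EXAMPLES)
    print(most_probable_translation(table, "bonjour"))
    print(most_probable_translation(table, "au revoir"))
```

In this toy corpus, "bonjour" comes back as "hello" because that pairing accounts for two of the three examples, while a phrase with no prior examples returns nothing at all. Notice that the code never touches meaning; it only counts.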
The emphasis in that quote is mine. The point is that GT operates through pattern recognition, comparing the current “problem” to vast numbers of past examples in order to identify not the single correct answer, but the answer that is most likely. If I’m not mistaken, this strategy is similar to the one IBM uses with Watson, the “supercomputer” that wowed us last year by beating a few top Jeopardy! champions at their own game.
The fantastic success of these pattern-recognition programs in other domains makes me think again about how the same strategy might be used to predict political events. (Apparently, Uncle Sam is doing the same; see this recent solicitation from the Director of National Intelligence’s IARPA shop.) So why haven’t we heard about any big breakthroughs in forecasting political events using these same techniques? If pattern-recognition programs can knock down the Tower of Babel and steamroll Jeopardy!’s greatest champions, can’t they also tell us where the next riot or coup attempt is going to happen?
Attempts to use pattern recognition to predict politics face two major obstacles. We can overcome the first, and maybe already have. The second, however, we cannot.
The first hurdle has to do with building libraries of examples. For pattern recognition to work well, we need to have lots of processed examples in which patterns can be identified for comparison to new cases. Traditionally, political scientists trying to apply this strategy to political forecasting have looked for patterns in sequences and combinations of events, and the event data on which those searches depended was coded by humans reading news stories. This approach was labor intensive and costly, and to my knowledge it has never produced any great results.
The growth of the World Wide Web and improvements in computer hardware and software may finally have solved this problem. Programs have been developed that can now produce reliable event data automatically at a fraction of the cost of the traditional human-coding efforts. (For examples, see here, here, and here.) Improvements in automated content analysis are also allowing researchers to break out of the event-analysis box and think about other aspects of relevant texts that might also contain signals about the future. (This post of mine from a few days ago describes one such effort.)
Even as our ability to process the historical record takes great leaps, however, there’s still a second hurdle that no software script can overcome. The more fundamental problem is that most political events of interest occur very rarely. On average, there are fewer than 10 coup attempts each year in countries worldwide. In the past two decades, countries have gone to war with each other only a handful of times. Over the past half-century, the number of transitions to democracy that have occurred is more like 100 than 1,000 or 10,000. In the entire history of the United States, there have only been 56 presidential elections.
The rarity of these events gives pattern-recognition techniques very little to train on. We can now produce fantastic quantities of data in which to search for patterns, but we still have very few examples from which those patterns might be identified. The power of statistical analysis depends, in large part, on the size of the sample from which we’re trying to extrapolate. Without stacks and stacks of relevant historical examples, pattern-recognition techniques will forever struggle to get the traction they need to produce accurate predictions of political phenomena.
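The sample-size point can be illustrated with a bit of arithmetic. The sketch below uses the textbook standard error of an estimated proportion, sqrt(p(1 - p) / n), to show how slowly uncertainty shrinks as examples accumulate. The 5 percent event rate and the sample sizes are assumptions chosen purely for illustration, not figures from any real data set, and the function names are hypothetical.

```python
import math


def standard_error(rate, n):
    """Standard error of a proportion estimated from n independent cases."""
    return math.sqrt(rate * (1 - rate) / n)


def relative_error(rate, n):
    """Standard error expressed as a fraction of the rate being estimated."""
    return standard_error(rate, n) / rate


if __name__ == "__main__":
    assumed_rate = 0.05  # illustrative assumption: the event occurs in 5% of cases
    for n in (50, 5_000, 500_000):
        se = standard_error(assumed_rate, n)
        rel = relative_error(assumed_rate, n)
        print(f"n={n:>7}: standard error={se:.4f}, relative error={rel:.1%}")
```

Because the error falls with the square root of the sample size, going from dozens of cases to hundreds of thousands buys a hundredfold gain in precision. Translation and trivia software works at the large end of that range; the study of coups and wars is stuck at the small end.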