Ignorance Is Not Always Bliss

Contrary to the views of some skeptics, I think that political science deserves the second half of its name, and I therefore consider myself to be a working scientist. The longer I’ve worked at it, though, the more I wonder if that status isn’t as much a curse as a blessing. After more than 20 years of wrestling with a few big questions, I’m starting to believe that the answers to those questions are fundamentally unknowable, and permanent ignorance is a frustrating basis for a career.

To see what I’m getting at, it’s important to understand what I take science to be. In a book called Ignorance, neurobiologist Stuart Firestein rightly challenges the popular belief that science is a body of accumulated knowledge. Instead, Firestein portrays scientists as explorers—“feeling around in dark rooms, bumping into unidentifiable things, looking for barely perceptible phantoms”—who prize questions over answers.

Working scientists don’t get bogged down in the factual swamp because they don’t care all that much for facts. It’s not that they discount or ignore them, but rather that they don’t see them as an end in themselves. They don’t stop at the facts; they begin there, right beyond the facts, where the facts run out.

What differentiates science from philosophy is that scientists then try to answer those questions empirically, through careful observation and experimentation. We know in advance that the answers we get will be unreliable and impermanent—“The known is never safe,” Firestein writes; “it is never quite sufficient”—but the science is in the trying.

The problem with social science is that it is nearly always impossible to do the kinds of experiments that would provide us with even the tentative knowledge we need to develop a fresh set of interesting questions. It’s not that experiments are impossible; they aren’t, and some social scientists are working hard to do them better. Instead, as Jim Manzi has cogently argued, the problem is that it’s exceptionally difficult to generalize from social-scientific experiments, because the number and complexity of potential causes are so great, and the underlying system, if there even is such a thing, is continually evolving.

This problem is on vivid display in a recent Big Think blog post in which eight researchers billed as some of the world’s “top young economists” identify what they see as their discipline’s biggest unanswered questions. The first entry begins with the sentence, “Why are developing countries poor?” The flip side of that question is, of course, “Why are rich countries rich?”, and if you put the two together, you get “What makes some economies grow faster than others?” That is surely the most fundamental riddle of macroeconomics, and yet the sense I get from empirical economists is that, after centuries of inquiry, we still just don’t know.

My own primary field of political development and democratization suffers from the same problem. After several decades of pondering why some countries have democratic governments while others don’t, the only thing we really know is that we still don’t know. When we pore over large data sets, we see a few strong correlations, but those correlations can’t directly explain the occurrence of relevant changes in specific cases. What’s more, so many factors are so deeply intertwined with each other that it’s really impossible to say which causes which. When we narrow our focus to clusters of more comparable cases—say, the countries of Eastern Europe after the collapse of Communism—we catch glimpses of things that look more like causal mechanisms, but the historical specificity of the conditions that made those cases comparable ensures that we can never really generalize even those ephemeral inferences.

It’s tempting to think that smarter experimentation will overcome or at least ameliorate this problem, but on broad questions of political and economic development, I’m not buying it. Take the question of whether U.S.-funded programs aimed at promoting democracy in other countries actually produce the desired effects. This sounds like a problem amenable to experimental design (what effect does intervention X have on observable phenomenon Y?), but it really isn’t. Yes, we can design and sometimes even implement randomized controlled trials (RCTs) to try to evaluate the impacts of individual interventions under specific conditions. As Jennifer Gauck has convincingly argued, however, it’s virtually impossible to get clear answers to the original macro-level question from the micro-level analyses such RCTs must entail, because the linkages between micro and macro are themselves unknown. Add thick layers of politicization, power struggles, and real-time learning, and it’s hard to see how even well-designed RCTs can push us off old questions and onto new ones.

I’m not sure where this puts me. To be honest, I increasingly wonder if my attraction to forecasting has less to do with the lofty scientific objective of using predictions to hone theories and more to do with the comfort of working on a more tractable problem. I know I can never really answer the big questions, and my attempts to do so sometimes leave me feeling like I’m trying to bail out the ocean, pouring one bucket at a time onto the sand in hopes of one day catching a glimpse of the contours of the floor below. By contrast, forecasting at least provides a yardstick against which I can assess the incremental value of specific projects. On a day-to-day basis, the resulting sense (illusion?) of progress provides a visceral feeling of accomplishment and satisfaction that is missing when I offer impossibly uncertain answers to deeper questions of cause and effect. And, of course, the day-to-day world is the one I actually have to inhabit.

I’d really like to end this post on a hopeful note, but today I’m feeling defeated. So, done.

Assessing the Risks of Risk Assessment

Tuesday’s Washington Post reports that a U.S. government task force now recommends that “men should no longer receive a routine blood test to check for prostate cancer because the test does more harm than good.”

After reviewing the available scientific evidence, the task force concluded that such testing will help save the life of just one in 1,000 men. At the same time, the test steers many more men who would never die of prostate cancer toward unnecessary surgery, radiation and chemotherapy, the panel concluded. For every man whose life is saved by PSA testing, another one will develop a dangerous blood clot, two will have heart attacks, and 40 will become impotent or incontinent because of unnecessary treatment, the task force said in a statement Monday.

This recommendation will sound familiar to any American who was within earshot of a TV or radio a few years ago, when the same task force updated its guidance on breast cancer to recommend against routine screening for women in their 40s. That recommendation raised a ruckus in some quarters, and the new guidance on prostate-cancer screening may do the same.

Whatever you think of them, these recommendations are useful reminders that applied risk assessment can have a downside. Because no screening system works perfectly, attempts to identify high-risk cases will always flag some low-risk cases by mistake. In statistical jargon, these mistakes are called “false positives.” For mathematical reasons, the rarer the condition–or, in many policy contexts, the rarer the unwanted event–the larger the number of false positives you can expect to incur for every “true positive,” or correct warning.
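
To put some arithmetic behind that “for mathematical reasons,” here’s a quick toy calculation in Python. The sensitivity, specificity, and base-rate values are numbers I’ve made up for illustration; the point is the shape of the relationship, not the specific figures.

    # Toy illustration: how rarity inflates false positives relative to true
    # positives, even when the screening test itself stays equally accurate.

    def fp_per_tp(prevalence, sensitivity=0.90, specificity=0.90):
        """Expected false positives per true positive in a screened population."""
        true_pos = prevalence * sensitivity                # cases correctly flagged
        false_pos = (1 - prevalence) * (1 - specificity)   # non-cases flagged anyway
        return false_pos / true_pos

    # Made-up base rates, from a fairly common condition down to a rare event.
    for prevalence in (0.20, 0.05, 0.01, 0.001):
        print(f"base rate {prevalence:>5.3f}: "
              f"~{fp_per_tp(prevalence):.1f} false positives per true positive")

With those made-up numbers, a screen that is 90 percent accurate in both directions produces roughly one false alarm for every two correct warnings when a fifth of the cases are positive, but more than a hundred false alarms per correct warning when only one case in a thousand is.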

This inevitable imprecision is the crux of the problem. To assess the value of any risk-assessment system, we have to compare its benefits to its costs. In the plus column, we have the expected benefits of early intervention: lives saved, suffering averted, crimes preempted, and the like. In the minus column, we have not only the costs of building and operating the screening system, but also the harmful effects of preventive action in the false positives. The larger the ratio of false positives to true positives, the larger these costs loom.
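
Here’s what that ledger might look like as a back-of-the-envelope sketch, again in Python. Every dollar figure and harm weight below is invented for the sake of illustration; none of it comes from the task force or from any real program.

    # Hypothetical cost-benefit tally for a screen-and-intervene program.
    # All inputs are invented; only the structure of the accounting matters.

    def net_value(n_screened, prevalence, sensitivity, specificity,
                  benefit_per_true_positive, harm_per_intervention,
                  cost_per_screen):
        """Expected net value of screening a population and acting on every flag."""
        true_pos = n_screened * prevalence * sensitivity
        false_pos = n_screened * (1 - prevalence) * (1 - specificity)
        benefits = true_pos * benefit_per_true_positive
        # Interventions in false positives carry the harms with none of the benefits.
        harms = (true_pos + false_pos) * harm_per_intervention
        overhead = n_screened * cost_per_screen
        return benefits - harms - overhead

    # Illustrative run: a rare condition, a decent but imperfect test, a large
    # benefit per case caught, and a modest harm per unnecessary intervention.
    net = net_value(n_screened=100_000, prevalence=0.001,
                    sensitivity=0.9, specificity=0.9,
                    benefit_per_true_positive=1_000_000,
                    harm_per_intervention=20_000, cost_per_screen=50)
    print(f"expected net value: {net:,.0f}")  # negative with these inputs

With these invented inputs, the harms spread across the many false positives swamp the benefits delivered to the few true positives. That is the same shape of arithmetic the task force is describing for PSA testing, though its inputs come from clinical evidence rather than my imagination.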

In the case of prostate cancer, epidemiologists can produce sharp estimates for each of those variables and arrive at a reasonably confident judgment about the net value of routine screening. With political risks like coups or mass killings, however, that’s a lot harder to do.

For one thing, it’s often not clear in the political realm what form preventive action should take, and some of the available forms can get pretty expensive. Diplomatic pressure is not especially costly, but things like large aid projects, covert operations, and peace-keeping forces often are.

What’s more, the preventive actions available to policy-makers often have uncertain benefits and are liable to produce unintended consequences. Aid projects sometimes distort local markets or displace local producers in ways that prolong suffering instead of alleviating it. Military interventions aimed at nipping threats in the bud may wind up expanding the problem by killing or angering bystanders and spurring “enemy” recruitment. Support for proxy forces can intensify conflicts instead of resolving them and may distort post-conflict politics in undesirable ways. The list goes on.

If a screening system were perfectly accurate, the costs of those unintended consequences would only accrue to interventions in true positives, and we could weigh them directly against the expected benefits of preventive action. In the real world, though, where false positives usually outnumber true positives by a large margin, there often won’t be any preventive benefits to counterbalance those unintended consequences. When we unwittingly intervene in a false positive, we get all of the costs and none of the prevention.

Improvements in the accuracy of our risk assessments can shrink this problem, but they can’t eliminate it. Even the most accurate early-warning system will never be precise enough to eliminate false positives, and with them the problem of costly intervention in cases that didn’t need it.
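
One more toy calculation, this time to show why better accuracy shrinks the problem without erasing it. Assume (generously, and purely for illustration) a screen that is right 99 percent of the time in both directions, applied to an event with a one-in-a-hundred base rate:

    # Even a very accurate screen leaves false alarms when the event is rare.
    # Hypothetical numbers: 1% base rate, 99% sensitivity and specificity.
    prevalence, sensitivity, specificity = 0.01, 0.99, 0.99
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    print(f"false positives per true positive: {false_pos / true_pos:.2f}")  # 1.00

Even under those generous assumptions, you still expect about one false alarm for every correct warning, and accuracy anywhere near that level is far beyond anything political forecasters can plausibly claim.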

We also know that social scientists still don’t understand the dynamics of the political and economic systems they study nearly well enough to speak with confidence about the likely effects and side-effects of specific interventions. (That is to say, we shouldn’t speak with great confidence about cause and effect, but that doesn’t stop many of us from doing so anyway.) As Jim Manzi argues in a brilliant 2010 essay, the problem is that, with social phenomena, “the number and complexity of potential causes of the outcome of interest”–what Manzi calls “causal density”–is fantastically high, and the counterfactuals required to untangle those causal threads are rarely available. As a result,

At the moment, it is certain that we do not have anything remotely approaching a scientific understanding of human society. And the methods of experimental social science are not close to providing one within the foreseeable future. Science may someday allow us to predict human behavior comprehensively and reliably. Until then, we need to keep stumbling forward with trial-and-error learning as best we can.

In short, we’re stuck in a world of imprecise early warnings and persistent uncertainty about the consequences of the interventions we might undertake in response to those imprecise warnings. It’s like trying to practice medicine with a grab bag of therapies and nothing but observational studies of one small population to guide choices about who needs them when, and what happens when they get them.

So what’s an empiricist to do? It’s tempting to throw up our hands and just say “fuggedaboudit,” but, as PM observes in a recent post at Duck of Minerva, “The alternative to good social science is not no social science. It’s bad social science.” In the absence of systematic risk assessment and cautious inferences about the consequences of various interventions, we won’t forego risk assessment and preventive action. Instead, we’ll stumble ahead with haphazard risk assessment and interventions driven by anecdote or ideology. Confronted with this choice, I’ll take fuzzy knowledge over willful ignorance any day.

That said, I do think the breadth of our uncertainty in these areas obliges us to concentrate our preventive efforts on two kinds of interventions: 1) ones that we understand well (e.g., vaccinations against infectious diseases), and 2) ones that are so small and simple that any side-effects will be inherently limited.

It’s tempting to think that bigger interventions will yield bigger benefits, but the benefits of these big schemes are often unproven, and the unintended consequences are likely to be larger as well (Exhibit A: U.S.-funded road-building in Afghanistan). There are a lot of ways that international politics isn’t like medicine, but the ethical concept of “First, do no harm” is undoubtedly relevant to both.
