We were easily able to replicate Rauchhaus’ key findings in Stata, but couldn’t get it to work in R. It took us a long while to work out why, but the reason turned out to be an error in Stata: Stata was finding a solution when it shouldn’t have (because of separation in the data). This solution, as we show in the paper, was wrong – and led Rauchhaus’ paper to overestimate the effect of nuclear weapons on conflict by a factor of several million.
That’s the money quote from a new post on the blog Political Science Replication by MIT grad students Mark Bell and Nick Miller. Their post is a great case study in the scientific importance of replication, and it comes wrapped in sound professional advice on how to do and present replications well.
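For readers who haven’t run into separation before, here is a minimal sketch in Python of why it trips up a logit model. This is a toy illustration, not Bell and Miller’s code or Rauchhaus’ data: the four data points, the no-intercept model, and the function names `is_separated` and `log_likelihood` are all invented for the example. When a predictor splits the outcomes perfectly, the log-likelihood keeps rising as the coefficient grows, so there is no finite maximum for the software to find, and any finite estimate it reports just reflects where the optimizer happened to stop.

```python
import math

def is_separated(xs, ys):
    """True if every x with y == 0 lies strictly below every x with y == 1.

    Checks one direction only, which is enough for this toy example.
    Assumes both outcome classes are present in ys.
    """
    zeros = [x for x, y in zip(xs, ys) if y == 0]
    ones = [x for x, y in zip(xs, ys) if y == 1]
    return max(zeros) < min(ones)

def log_likelihood(beta, xs, ys):
    """Log-likelihood of a no-intercept logit: P(y=1|x) = 1 / (1 + exp(-beta*x))."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = beta * x
        # Each observation contributes log(sigmoid(s)), with s = z when y == 1
        # and s = -z when y == 0.
        s = z if y == 1 else -z
        # Numerically stable log(sigmoid(s)).
        if s >= 0:
            total += -math.log1p(math.exp(-s))
        else:
            total += s - math.log1p(math.exp(s))
    return total

# Made-up data: y equals 1 exactly when x is positive, so x separates y perfectly.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

separated = is_separated(xs, ys)
betas = [1, 5, 10, 20, 40]
lls = [log_likelihood(b, xs, ys) for b in betas]

print("separated:", separated)
for b, ll in zip(betas, lls):
    print(f"beta = {b:>2}  log-likelihood = {ll:.3e}")
```

The printed log-likelihood climbs toward zero with every increase in `beta` and never turns back down. In this toy setup, that is the signal that a reported coefficient should not be trusted.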
After many years of doing statistical work without making replication materials available, I’m now trying to abide by these norms in work I’m doing for the Holocaust Museum’s Center for the Prevention of Genocide and on my blog. So far, I’m finding that:
- Making the guts of your research accessible to others takes some additional work up front, but the amount of additional labor involved declines steeply once you internalize this norm and start building projects with replication in mind from the get-go;
- Restrictions on data-sharing because of licenses or scholars’ embargoes are still a serious impediment to replication in political science, and data-makers who continue to impose these kinds of restrictions risk making their products irrelevant in a world where replication is becoming standard practice; and
- Once you make the gestalt shift to thinking of your research as not really your own but part of a larger scientific endeavor that doesn’t really give a crap about you personally, making your work accessible for replication stops being so scary and can even be downright exhilarating.