Reupping: Why reproducibility initiatives are misguided

I’m reposting this two-year-old piece because it’s worth reminding ourselves why exact replication has, with minor exceptions, never been an important part of science:

In my latest Pacific Standard column, I take a look at the recent hand-wringing over the reproducibility of published science. A lot of people are worried that poorly done, non-reproducible science is ending up in the peer-reviewed literature.

Many of these worries are misguided. Yes, as researchers, editors, and reviewers we should do a better job of filtering out bad statistical practices and poor experimental designs; we should also make sure that data, methods, and code are thoroughly described and freely shared. To the extent that sloppy science is causing a pervasive reproducibility problem, we absolutely need to fix it.

But I’m worried that the recent reproducibility initiatives go beyond merely targeting sloppy science, and instead impose a standard on research that is not particularly useful and is completely ahistorical. When you see a hot new result published in Nature, should you expect other experts in the field to be able to reproduce it exactly?

Sloppiness vs Reproducibility

I’m not a big fan of reproducibility projects. Shoddy papers shouldn’t be tolerated, but the truth is that sometimes rigorously done research isn’t reproducible — and when that happens, science gets interesting. It should go without saying that a peer-reviewed paper isn’t a guarantee of truth. If done properly, a paper is a record of a rigorous attempt to discover something about the world, no more, no less. What we believe about nature should reflect the accumulated evidence of many researchers and many papers, and that means the scientific literature should reflect our latest tentative, bleeding-edge thinking, even at the risk of being wrong. It’s counterproductive to hold up publication until some other lab reproduces your result, or to retract papers that don’t hold up, unless they had clear methodological flaws or artifacts that should have been caught in review.

Two recent articles capture what I think is the right attitude on reproducibility. First, as David Allison and his colleagues write, we as a community of researchers, editors, and reviewers are not doing as well as we should at upholding best statistical and other methodological practices:

 In the course of assembling weekly lists of articles in our field, we began noticing more peer-reviewed articles containing what we call substantial or invalidating errors. These involve factual mistakes or veer substantially from clearly accepted procedures in ways that, if corrected, might alter a paper’s conclusions.

There is no excuse for this kind of sloppiness.

On the other hand, here is Columbia’s Stuart Firestein:

The failure to replicate a part or even the whole of an experiment is not sufficient for indictment of the initial inquiry or its researchers. Failure is part of science. Without failures there would be no great discoveries.

So yes, let’s clean up science by rooting out obvious “invalidating practices” that all too often plague papers in journals at all tiers. But let’s not be naive about how science works, and what the scientific literature is supposed to be. To paraphrase what I wrote recently, if some of our studies don’t turn out to be wrong, then we’re not pushing hard enough at the boundaries of our knowledge.

How to advance science by failure

Stuart Firestein has a provocative piece in Nautilus on the role of failing well in science:

As your career moves on and you have to obtain grant support you naturally highlight the successes and propose experiments that will continue this successful line of work with its high likelihood of producing results. The experiments in the drawer get trotted out less frequently and eventually the drawer just sticks shut. The lab becomes a kind of machine, a hopper—money in, papers out.

My hope of course is that things won’t be this way for long. It wasn’t this way in the past, and there is nothing at all about science and its proper pursuit that requires a high success rate or the likelihood of success, or the promise of any result. Indeed, in my view these things are an impediment to the best science, although I admit that they will get you along day to day. It seems to me we have simply switched the priorities. We have made the easy stuff—running experiments to fill in bits of the puzzle—the standard for judgment and relegated the creative, new ideas to that stuck drawer. But there is a cost to this. I mean a real monetary cost because it is wasteful to have everyone hunting in the same ever-shrinking territory…

How will this change? It will happen when we cease, or at least reduce, our devotion to facts and collections of them, when we decide that science education is not a memorization marathon, when we—scientists and nonscientists—recognize that science is not a body of infallible work, of immutable laws and facts. When we once again recognize that science is a dynamic and difficult process and that most of what there is to know is still unknown.

The hard… is what makes it great

There are a lot of things to love in this piece from Christie Aschwanden about why retractions, studies that don’t hold up to reproduction, and even sub-fraudulent “p-hacking” do not mean that science is broken – they mean that science is, simply, very hard. Among those things are the great visuals from Ritchie King, including a fun “p-hacking” demonstration tool.
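For readers who haven’t run into the term, “p-hacking” means trying many outcomes or analysis choices and reporting only the ones that clear the significance threshold. Below is a minimal sketch of the idea – not King’s interactive tool, just an illustrative simulation with made-up group sizes and outcome counts – showing that even when no real effect exists, measuring twenty independent outcomes hands you at least one “significant” p < 0.05 result in most studies.

# Illustrative p-hacking simulation: test many unrelated outcomes on pure
# noise and count how often at least one looks "significant".
# All parameters here are arbitrary, chosen only for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 1000   # simulated studies
n_outcomes = 20        # outcomes measured per study (all pure noise)
n_subjects = 30        # subjects per group

false_positive_studies = 0
for _ in range(n_experiments):
    p_values = []
    for _ in range(n_outcomes):
        # Both groups are drawn from the same distribution: no real effect.
        a = rng.normal(size=n_subjects)
        b = rng.normal(size=n_subjects)
        _, p = stats.ttest_ind(a, b)
        p_values.append(p)
    # "p-hacking": the study counts as a success if ANY outcome clears 0.05.
    if min(p_values) < 0.05:
        false_positive_studies += 1

print(f"Studies with at least one 'significant' result: "
      f"{false_positive_studies / n_experiments:.0%}")
# Expect roughly 1 - 0.95**20, or about 64%, despite there being no effect.

The arithmetic behind the sketch is just 1 − 0.95^20 ≈ 0.64: report only the tests that “worked,” and noise starts to look like discovery.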

For me, the real take-home message goes beyond the “science is hard” catchphrase. Science isn’t just hard in the way implied by Tom Hanks’ Jimmy Dugan character in A League of Their Own:

It’s supposed to be hard. If it wasn’t hard, everyone would do it. The hard… is what makes it great.

Contrary to the rhetoric that treats “science is hard” as a celebration of triumph over a monumentally difficult task, that is not the point here.

As Aschwanden explains, science is hard because it is messy and complicated, and because it requires a communal effort from members of a species that is only dubiously social outside of relatively narrow local groups.

If we’re going to rely on science as a means for reaching the truth — and it’s still the best tool we have — it’s important that we understand and respect just how difficult it is to get a rigorous result.

There are things like sampling variance and mistakes and uncontrollable environmental variables and resource limits and the fabled “orthologous methods” that inject all sorts of inconsistency and challenges into the textbook scientific method. This is why the great philosophers of science* spoke about disproof rather than proof, about independent reproducibility, about probability rather than certainty.

These issues do not indicate that science is broken. There simply is no other way it could work in the hands of mere humans. What may be broken is the way we perceive science. We need to understand that it is a gradual, communal effort. We need to understand that our myths of science – the great man (in the stories it is usually a man) performing a great experiment and making a great discovery – are almost always false summaries: convenient and inspiring, but not representative of why science is truly hard.

*It is also why those who dismiss the philosophy of science as a waste of time – I’m looking at you, Neil deGrasse Tyson – deserve nothing but the most vigorous of side-eyes on that point.

The Cancer Reproducibility Project is Incredibly Naive, Probably Useless, and Potentially Damaging

I’ve always thought the Reproducibility Project represented an incredibly naive approach to the scientific method. This excellent news piece in Science sums up many of the reasons why. As Richard Young says in the piece, “I am a huge fan of reproducibility. But this mechanism is not the way to test it.” Here’s why:

1) Reproducibility in science is not achieved by having a generic contract research organization replicate a canned protocol, for good reason: cutting-edge experiments are often very difficult and require specialized skills to get running. Replication is instead achieved by other labs in the field who want to build on the results. Sometimes this is done using the same protocol as the original experiment, and sometimes by obtaining similar results in a different system using a different method.

2) For this reason, I don’t have much confidence that the results obtained by the Reproducibility Project will accurately reflect the state of reproducibility in science. A negative result could mean many things — and most likely it will reflect a failure of the contract lab rather than an inherent problem with the result. Contrary to the claims of the project’s leaders, the data produced by the Project will probably not be useful to people who are serious about estimating the scope of irreproducibility in science. At its worst, it could be extremely misleading by painting an overly negative picture of the state of science. It has already been damaging by promoting a too-naive view of how successful science actually works.

3) As the Science piece points out, there is a much better, cheaper, and scientifically sensible way to achieve better reproducibility. If many papers out there are suspect because they lack proper controls, don’t use validated reagents, fail to describe methods adequately, or rely on flawed statistics, then we don’t need to spend millions of dollars and thousands of hours of effort trying to repeat experiments. We need to make sure editors and reviewers require proper controls, reagents, statistics, and full methods descriptions.

It’s worth reading the full article, but below the fold are some salient quotes: