Should research code be released as part of the peer review process?
So there have been a few reactions to my latest post on accountable research software, including a Twitter kerfuffle (again). Ever notice how people come out really aggressive on Twitter? Must be the necessity of compressing ideas into 140 characters. You can’t just write “Interesting point you make there, sir. Don’t you think that your laudable goal would be better served by adopting the following methodolo…” Oops, ran out of characters. OK, let’s just call him an asshole: seven characters used. Move on.
What I will try to do here is compile the various opinions expressed about research software, its manner of publication and accountability. I will also attempt to explain what my opinion is on the matter. I do not think mine is the only acceptable one. As this particular subject is based on values, my take is subject to my experiential baggage, as it were.
Back to business.
I do worry a little about one of the justifications given for distributing research code—the need to replicate experiments. A proper replication for a computational method is not running the same code over again (and thus making the same mistakes), but re-implementing the method independently. Having access to the original code is then useful for tracking down discrepancies, as it is often the case that the good results of a method are due to something quite different from what the original researchers thought. I fear that the push to have highly polished distributable code for all publications will result in a lot less scientific validation of methods by reimplementation, and more “ritual magic” invocation of code that no one understands. I’ve seen this already with code like DSSP, which almost all protein structure people use for identifying protein secondary structure with almost no understanding of what DSSP really does or exactly how it defines H-bonds. It does a good enough job of identifying secondary structure, so no one thinks about the problems.
Kevin presents what to some may seem a radical opinion: not how to make research software accountable, but whether we should make it available in the first place. This seemingly goes against everything that scientists should stand for: transparency and the sharing of resources. He points out two possible dangers: the one to actual reproducibility, and the other to the role of bioinformaticians:
I fear that the push for polished code from researchers is an attempt to replace computational researchers with software publishing teams. The notion is that the product of the research is not the ideas and the papers, but just free code for others to use. It treats bioinformaticians as servants of “real” researchers, rather than as researchers in their own right. It’s like demanding that no papers on possible drug leads be published until Phase III trials have been completed (though not quite that expensive), and then that the drug be distributed for free.
Kevin’s post got me thinking that perhaps not all research software should be released, at least not as part of the Methods section (and hence the peer-review phase of the paper), and also that perhaps research software, as we write it in the lab, is not all intended for release. My own concern is that there might be unintended consequences to mandating code release during peer review as a condition for publication. One such consequence might be that imperfect code (and research code is imperfect by its very nature of being highly prototypical) may frustrate referees to the point that they cannot properly run and assess it; and since they cannot ask for support, the publication will suffer. Also, installation is time-consuming: burdening referees with installing and testing software might just cause them to turn down papers that are mandatorily accompanied by code.

The nascent Bioinformatics Testing Consortium does offer a solution to this problem, by having the code go through a hardening cycle prior to submission. But even then, labs can only spend so much time and effort cleaning up, documenting and hardening their software. Labs that can afford to bring their research code up to hardening and documentation standards would be in a better position to publish than those which cannot. Is that bad? It may be, because it is only in some cases (I’ll get to that) that robust, well-documented code is actually needed to review a paper. In many cases, code release during review is superfluous, and the effort of bringing it up to standards may unfairly impact labs whose manpower is already stretched. If the Methods section of the paper contains the description and equations necessary for replication of the research, that should be enough in many cases, perhaps accompanied by code release post-acceptance. Exceptions do apply. One notable exception would be a paper that is mostly a methods paper, where the software, not just the algorithm, is key.
Mostly, that is already done in journals like NAR, Bioinformatics and BMC Bioinformatics, where such papers appear and software is reviewed along with the manuscript. Another exception would be the case Titus Brown and Jonathan Eisen wrote about: where the software is so central and novel that not peer-reviewing it along with the paper makes assessing the paper’s findings impossible.
Better unsupported code than no code?
Following my previous post, I was asked several times whether releasing unsupported code is better than no code at all. Isn’t something better than nothing? Intuitively the answer seems obvious: release the code and let others deal with it, as some information is better than no information. I don’t subscribe to that, though. When it comes to code release, documentation and support are part of the package. A lab doing less than that will be negatively impacted, as anyone releasing seemingly shoddy work may be. Again, the lab-notebook analogy: when writing up the Methods section of a paper, you write up the relevant parts from the pages that worked, not the 90% of false starts that your lab notebook contains.
So how about taking the scripts that work, putting them in the pipeline you used, and releasing that? Would that not be the equivalent of taking the relevant bits from your lab notebook and releasing them? Maybe. But as any programmer will tell you, the documentation, process, and even semi-hardening of the code to handle input contingencies take a lot of time and effort. Again, we see imperfect software all around us, even (especially?) software we pay for. That’s why software development has alpha and beta phases, release cycles, documentation, upgrades, etc. If your code does not compile three months down the line (which can be even before paper publication) because it is incompatible with the current libc release, are you responsible for changing it? Or should anyone wanting to use your code be forced to keep a double set of libraries, which is a pain to manage? There are many cases of scientific software that works “just so” with old libraries and compilers, simply because the labs that released it cannot afford to maintain compatibility.
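The library-drift problem can be made concrete with a small sketch. One modest, low-effort courtesy a lab could extend when releasing otherwise unsupported code is a fail-fast version check, so a user at least learns *why* the code no longer runs instead of hitting an obscure crash. The package names and version numbers below are hypothetical examples, not the requirements of any real project:

```python
# Sketch: fail-fast dependency check for released research code.
# Package names and version series here are hypothetical examples.

REQUIRED = {
    "numpy": "1.21",   # version series the code was developed against
    "scipy": "1.7",
}

def check_versions(installed, required=REQUIRED):
    """Compare installed package versions against the versions the code
    was written for; return a list of human-readable problem messages."""
    problems = []
    for name, want in required.items():
        have = installed.get(name)
        if have is None:
            problems.append(f"{name} is not installed (need {want}.x)")
        elif not have.startswith(want + "."):
            problems.append(f"{name} {have} found, but this code was "
                            f"tested only against {want}.x")
    return problems

if __name__ == "__main__":
    # In real use, `installed` would be read from the environment,
    # e.g. via importlib.metadata.version(); hard-coded here for the sketch.
    installed = {"numpy": "1.26.4"}
    for msg in check_versions(installed):
        print("WARNING:", msg)
```

This does not solve the underlying incompatibility, of course; it only converts a mysterious failure into an explicit statement of which library versions the code was frozen against, which is about the cheapest documentation of the “just so” environment a stretched lab can provide.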
There are several problems associated with releasing code as part of the peer-review process, and I am not sure we have solutions quite yet. This post was supposed to be a response to some of the concerns raised, but I keep gravitating back to the BTC (disclosure: I’m a member), which at this point seems to be the only practical approach offered for those cases in which code is needed at the review stage. However, as I tried to point out in this rambling post, mandatory code release at review may not always be a good thing, and should be carefully considered.