Computations: not reproducibility friendly!

Via John Hawks, I got a link to this piece by John Timmer at Ars Technica:

In recent years, scientists may have inadvertently given up on a key component of the scientific method: reproducibility. That’s an argument that’s being advanced by a number of people who have been tracking our increasing reliance on computational methods in all areas of science. An apparently simple computerized analysis may now involve a complex pipeline of software tools; reproducing it will require version control for both software and data, along with careful documentation of the precise parameters used at every step. Some researchers are now getting concerned that their peers simply aren’t up to the challenge, and we need to start providing the legal and software tools to make it easier for them.
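To make "careful documentation of the precise parameters used at every step" a little more concrete, here is a minimal sketch in Python of what such a record could look like for one step of a pipeline. It is only an illustration, not something taken from the Ars Technica piece: the function names (`file_sha256`, `provenance_record`, `save_record`) and the fields stored are my own assumptions, and a real pipeline would probably also want to record the versions of the libraries it uses.

```python
import hashlib
import json
import platform
import sys


def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(step_name, parameters, input_paths):
    """Collect what a reader would need to rerun one pipeline step.

    All field names here are hypothetical choices for illustration.
    """
    return {
        "step": step_name,
        "parameters": parameters,
        # A hash per input file pins down exactly which data were used.
        "inputs": {path: file_sha256(path) for path in input_paths},
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
    }


def save_record(record, out_path):
    """Write the record as sorted, indented JSON so that it diffs cleanly."""
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)
```

Calling `provenance_record("fit", {"tolerance": 1e-6}, ["data.txt"])` and saving the result next to the output of that step would leave a small file that can be kept under version control along with the code.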

In my discussions with colleagues and friends, I have heard several reasons offered for this state of affairs.

Some are of the opinion that even if the code and the data are given, some computations are so costly that peers are really not going to check them (a point also made by John Hawks in his post). While this may be true to some extent, even if only one grad student somewhere decides to take a look at the code and run it to try and reproduce the results, sharing is still worth the effort.

The next one is the space constraint: one cannot give out all the details of the code and data in the paper itself. This is where online publication of the supplementary information is very helpful.

The third (and, in my opinion, the most important, at least in the computational community of which I am a part) is the fact that many groups, when they develop a code, keep it as the property of the group; codes for solving some families of partial differential equations using specific techniques, or for carrying out optimization, belong to this category. Generations of grad students keep adding modules to the main part of the code to solve interesting problems. In such a case, giving away the code means that the person who put in the (more often than not huge) effort of writing the core of the code loses out. Further, unlike in experimental research, once a code for solving some problem is available, the effort involved in adding modules to solve other similar but significant problems (significant from the point of view of generating publishable results) is marginal. So, there is really no incentive for the person who first developed the code to give it away, even for peer review. This problem can only be overcome if code sharing of this sort is respected by the community and is taken into account while assessing the contributions of any researcher; in other words, it can only be overcome if the scientific community comes up with practices that encourage such code sharing.

4 Responses to “Computations: not reproducibility friendly!”

  1. VS Says:

    As an ex-experimentalist turned simulationist, I agree with what you say. Experimentalists are fond of boring you with details of how they synthesised their glass samples. They know you can't do it without specialised equipment. Simulationists thrive on secrecy!

    I suppose it is because the tools for simulationists are pretty much free that people are wary of releasing information. As you mentioned, algorithms can be implemented fast, and even inefficient coding can be compensated for by powerful computing resources. It is something else to sit and read a bunch of articles, and then choose and verify an appropriate algorithm.

    V. Sitaram

  2. VS Says:

    Buddha’s statement on secrecy needs an update now!

    V. Sitaram

    • Guru Says:

      Dear Sitaram,

      Thanks for drawing my attention to Buddha on secrecy: for the readers of this blog, who might not be aware of his statement (as I was till your pointer came), here it is.

  3. Reproducibility: the key to good research « Entertaining Research Says:

    […] oriented papers. You should also take a look at this page that the post links to. Here are my own thoughts on reproducibility in computational research (and, there are some interesting stuff in the comments section of that post […]
