In recent years, scientists may have inadvertently given up on a key component of the scientific method: reproducibility. That’s an argument that’s being advanced by a number of people who have been tracking our increasing reliance on computational methods in all areas of science. An apparently simple computerized analysis may now involve a complex pipeline of software tools; reproducing it will require version control for both software and data, along with careful documentation of the precise parameters used at every step. Some researchers are now getting concerned that their peers simply aren’t up to the challenge, and we need to start providing the legal and software tools to make it easier for them.
In discussions with colleagues and friends, I have heard several reasons given for this state of affairs.
Some are of the opinion that even if the code and data are provided, some computations are so costly that peers are not really going to check them (a point also made by John Hawks in his post). While that may be true to some extent, if even one grad student somewhere decides to take a look at the code and run it to try to reproduce the results, sharing it is still worth the effort.
The next reason is the space constraint: one cannot include all the details of the code and data in a paper. This is where online publication of supplementary information is very helpful.
The third reason (and, in my opinion, the most important, at least in the computational community of which I am a part) is that many groups around the world, when they develop a code, keep it as the property of the group; codes for solving some families of partial differential equations using specific techniques, or for carrying out optimization, belong to this category. Generations of grad students keep adding modules to the main part of the code to solve interesting problems. In such a case, giving away the code means that the person who put in the (more often than not huge) effort to write the core of it loses out. Further, unlike in experimental research, once a code for solving some problem is available, the effort involved in adding modules to solve other, similar but significant (from the point of view of generating publishable results) problems is marginal. So there is really no incentive for the person who first developed the code to give it away, even if only for peer review. This problem can be overcome only if code sharing of this sort is respected by the community and is taken into account while assessing the contributions of any researcher; in other words, only if the scientific community comes up with practices that encourage such code sharing.
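As a small illustration of the record-keeping that the opening passage calls for (precise parameters at every step, plus some way of pinning down the data and code versions), here is a minimal sketch in Python. It is only one possible way of doing this, not an established practice: the function names, the field names and the example step are all my own assumptions, and the `code_version` string is a placeholder for whatever identifier one's version control system provides.

```python
import hashlib
import json
import platform


def provenance_record(step_name, params, data_bytes, code_version):
    """Build a small, JSON-serializable record describing one analysis step.

    All field names here are illustrative assumptions, not a standard.
    """
    return {
        "step": step_name,
        # The exact parameters used for this step.
        "params": params,
        # A content hash identifies the input data without storing it.
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        # Hypothetical identifier, e.g. a version-control commit ID.
        "code_version": code_version,
        # The interpreter version is part of the software environment too.
        "python_version": platform.python_version(),
    }


def to_json(record):
    # sort_keys gives a stable text form that diffs cleanly under version control.
    return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    record = provenance_record(
        step_name="smooth",  # hypothetical pipeline step
        params={"window": 5, "method": "median"},  # hypothetical parameters
        data_bytes=b"1.0,2.0,3.0\n",
        code_version="unknown",
    )
    print(to_json(record))
```

Writing one such record next to the output of each step in a pipeline would give a reader the parameters and a fingerprint of the inputs, even when the full data cannot be published alongside the paper.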