Question: Consistency checks on results files?
russhh (UK, U. Glasgow) wrote 18 months ago:

When I send results out to my colleagues, I'm a bit worried that any subsequent work on a project may change those results, particularly when those results are being followed up by subsequent benchwork. "Hi, could you send me the top 10 hits from experiment XXX?" Yeah, no problem. But can I send them the same results again after I've refactored this bit of my script, or after I've added this seemingly independent feature to my program?

At present I don't do any consistency checks on my results files, and I was wondering what approaches are used _out there_.

Do you md5sum every results file and raise a note when those values change for example?
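To make that concrete, here is a rough sketch of the kind of checksum check I have in mind. The function names `md5_of` and `changed_results`, and the JSON manifest layout, are made up for illustration and are not taken from any existing tool:

```python
import hashlib
import json
from pathlib import Path


def md5_of(path):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_results(results_dir, manifest_path):
    """Compare current checksums with a stored manifest, then update it.

    Returns the sorted names of files whose checksum differs from the
    stored one. Files seen for the first time are not reported.
    """
    manifest_path = Path(manifest_path)
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {
        p.name: md5_of(p)
        for p in sorted(Path(results_dir).iterdir())
        # Skip sub-directories and the manifest itself (hypothetical layout).
        if p.is_file() and p.resolve() != manifest_path.resolve()
    }
    changed = sorted(n for n, md5 in new.items() if n in old and old[n] != md5)
    manifest_path.write_text(json.dumps(new, indent=2, sort_keys=True))
    return changed
```

Running `changed_results("results", "manifest.json")` after each pipeline run would then list any previously seen file whose bytes have changed. Note that it only looks at files directly inside the directory, and a byte-level checksum will also flag harmless differences such as row order or embedded timestamps.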

Do you have a continually updated results silo on Dropbox or similar, and let your colleagues pull results from there?

Do you diff & log before updating any results file?
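By "diff & log" I mean something along these lines. This is only a sketch for text results files; `update_with_log` and the log format are hypothetical names and choices of mine:

```python
import difflib
from datetime import datetime
from pathlib import Path


def update_with_log(new_text, results_path, log_path):
    """Overwrite results_path with new_text, logging a unified diff first.

    Returns True if the content changed (a diff was appended to the log
    and the file was rewritten), False if nothing changed.
    """
    results_path = Path(results_path)
    old_text = results_path.read_text() if results_path.exists() else ""
    diff = list(difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"{results_path.name} (previous)",
        tofile=f"{results_path.name} (new)",
    ))
    if not diff:
        return False
    # Append a timestamped header plus the diff, then update the file.
    with open(log_path, "a") as log:
        log.write(f"# {datetime.now().isoformat()} {results_path}\n")
        log.writelines(diff)
    results_path.write_text(new_text)
    return True
```

The log then gives a running history of exactly which rows changed and when, which is more informative than a checksum but only practical for text files of modest size.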

modified 18 months ago by Devon Ryan • written 18 months ago by russhh
Devon Ryan (Freiburg, Germany) wrote 18 months ago:

If things are changing then you should be giving those results a different name so it's obvious what came from what (this should also be kept in your snakefiles or whatever you're using for performing an analysis). In other words, don't actually update the files unless you don't care at all about what they used to contain.
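A minimal sketch of that idea, assuming results are written as text and that the version label is something you choose yourself (a date, a tag, a commit hash). The helper name `write_versioned` is hypothetical:

```python
from pathlib import Path


def write_versioned(text, out_dir, stem, version, suffix=".tsv"):
    """Write text to <out_dir>/<stem>.<version><suffix>, never overwriting.

    Raises FileExistsError if that versioned file already exists, so an
    old result can only be replaced by deliberately deleting it.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{stem}.{version}{suffix}"
    # Mode "x" is exclusive creation: it fails instead of truncating.
    with open(target, "x") as handle:
        handle.write(text)
    return target
```

For example, `write_versioned(table, "results", "top_hits", "v2")` would create `results/top_hits.v2.tsv` and leave any earlier `top_hits.v1.tsv` untouched.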

written 18 months ago by Devon Ryan

I'm happy for results to change and feel that this will be an inevitable part of an evolving research project. What I'd like to be aware of is when things are changing when I don't expect them to (refactoring of my code; updating my packages / environment), and when results that I've previously sent to colleagues have changed due to altered requirements / bug fixes etc.

reply written 18 months ago by russhh

Right, but all of that (software versions and such) should be static to a given project.

reply modified 18 months ago • written 18 months ago by Devon Ryan

I'm not sure I agree. Certainly the analysis code / environments / dependencies will be static once the project is mothballed; but during its active development, these things will necessarily change.

reply written 18 months ago by russhh

In my experience software versions rarely if ever change during the course of a project. Otherwise it's a pain to keep track of which version produced which result.
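If versions do change, one way to keep track is to record them next to each result. A rough sketch for a Python-based analysis, where `provenance` is a hypothetical helper and the package list is whatever your analysis actually depends on:

```python
import json
import platform
from importlib import metadata


def provenance(packages):
    """Return the Python version and installed versions of the named packages.

    Packages that are not installed are recorded as None.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {"python": platform.python_version(), "packages": versions}


def write_provenance(packages, path):
    """Save the provenance record as JSON, e.g. alongside a results file."""
    with open(path, "w") as handle:
        json.dump(provenance(packages), handle, indent=2, sort_keys=True)
```

Writing such a file with every set of results at least makes it possible to see later which environment produced which output.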

reply written 18 months ago by Devon Ryan