Question: Consistency checks on results files?
russhh (4.9k), UK, U. Glasgow, wrote 10 months ago:

When I send results out to my colleagues, I'm a bit worried that any subsequent work on a project may change those results, particularly when those results are being followed up by benchwork. "Hi, could you send me the top 10 hits from experiment XXX?" Yeah, no problem. But could I send you the same hits again once I've refactored this bit of my script, or added this seemingly independent feature to my program?

At present I don't do any consistency checks on my results files, and I was wondering what approaches are used _out there_.

Do you md5sum every results file and raise a note when those values change, for example?

Do you have a continually updated results silo on Dropbox or similar, and let your colleagues pull results from there?

Do you diff & log before updating any results file?
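
To make the md5sum idea concrete, here is a minimal sketch of what such a check might look like. It assumes the results live in one directory and that a JSON manifest of checksums is kept outside that directory; the names `md5_of`, `check_results` and the manifest layout are made up for illustration, not taken from any existing tool.

```python
import hashlib
import json
from pathlib import Path


def md5_of(path):
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_results(results_dir, manifest_path):
    """Compare current checksums with the stored manifest, then update it.

    Returns a dict mapping relative file name -> "new", "changed" or "missing".
    The manifest is assumed to sit outside results_dir so it is not hashed itself.
    """
    results_dir = Path(results_dir)
    manifest_path = Path(manifest_path)
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {
        str(p.relative_to(results_dir)): md5_of(p)
        for p in sorted(results_dir.rglob("*"))
        if p.is_file()
    }
    report = {}
    for name, checksum in new.items():
        if name not in old:
            report[name] = "new"
        elif old[name] != checksum:
            report[name] = "changed"
    for name in old:
        if name not in new:
            report[name] = "missing"
    # Overwrite the manifest so the next run compares against this state.
    manifest_path.write_text(json.dumps(new, indent=2, sort_keys=True))
    return report
```

Running `check_results("results", "results.md5.json")` after each pipeline run would return an empty dict when nothing has moved, and the report could be appended to a log when it is not empty.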

modified 10 months ago by Devon Ryan (93k) • written 10 months ago by russhh (4.9k)
Devon Ryan (93k), Freiburg, Germany, wrote 10 months ago:

If things are changing, then you should give those results a different name so it's obvious what came from what (the naming should also be recorded in your Snakefiles, or whatever you're using to run the analysis). In other words, don't update the files in place unless you don't care at all about what they used to contain.
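
One way this could look in practice is sketched below. The naming scheme (`stem.version.suffix`) and the helper names `versioned_path` and `write_new_results` are assumptions for illustration only; the point is simply that each round of results gets its own name and an existing file is never silently replaced.

```python
from pathlib import Path


def versioned_path(directory, stem, version, suffix=".tsv"):
    """Build a results path such as results/top_hits.v2.tsv (hypothetical scheme)."""
    return Path(directory) / f"{stem}.{version}{suffix}"


def write_new_results(path, text):
    """Write results to a fresh file; raise FileExistsError rather than overwrite."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "x" is exclusive creation: open() fails if the file already exists.
    with open(path, "x") as handle:
        handle.write(text)
    return path
```

With this, a second attempt to write `top_hits.v1.tsv` fails loudly, which forces a deliberate choice of a new version label.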

written 10 months ago by Devon Ryan (93k)

I'm happy for results to change, and feel that this is an inevitable part of an evolving research project. What I'd like to be aware of is when things change unexpectedly (after refactoring my code, or updating my packages / environment), and when results that I've previously sent to colleagues have changed due to altered requirements, bug fixes, etc.

written 10 months ago by russhh (4.9k)

Right, but all of that (software versions and such) should be static to a given project.

modified 10 months ago • written 10 months ago by Devon Ryan (93k)

I'm not sure I agree. Certainly the analysis code / environments / dependencies will be static once the project is mothballed, but during its active development these things will necessarily change.

written 10 months ago by russhh (4.9k)

In my experience, software versions rarely, if ever, change during the course of a project. Otherwise it's a pain to keep track of which version produced which result.
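
If versions do change, one possible way to keep track is to write a small sidecar file next to each results file. The sketch below assumes a Python-based analysis and uses the standard library's `importlib.metadata`; the function names and the `.provenance.json` suffix are hypothetical choices for illustration.

```python
import json
import sys
from importlib import metadata


def provenance(packages):
    """Return Python and package versions; packages that are not installed map to None."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {"python": sys.version.split()[0], "packages": versions}


def write_provenance(results_path, packages):
    """Write the version record to '<results_path>.provenance.json' and return that path."""
    sidecar = f"{results_path}.provenance.json"
    with open(sidecar, "w") as handle:
        json.dump(provenance(packages), handle, indent=2, sort_keys=True)
    return sidecar
```

Calling `write_provenance("top_hits.v1.tsv", ["numpy", "pandas"])` (package names chosen only as an example) would record which versions were installed when that file was produced.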

written 10 months ago by Devon Ryan (93k)