Question: Physical Lab Notebook for Computational Work
sviatoslav.kendall (United States) wrote 6.4 years ago:

Today my lab informed me that I need to keep a physical record of all the computational work I do. For context: I work in a medical school in New York that does translational medicine.

From reading this forum, I know that there are electronic notebooks out there, such as http://ipython.org/notebook.html, and this is something I intend to look into more, but I don't have anyone in my lab to talk to about best practices. I'm pretty new to keeping a lab notebook in the first place; I understand that the guiding principle is "Can someone else re-create your work with only your notebook as a reference?"

My specific concerns are: 

All of my code has pretty decent documentation, so I could quickly print that out, but does that make it the intellectual property of the lab? Can I re-use it later if I change jobs, or potentially sell it? What, if anything, should I do about "ownership" of my code beyond putting my name in the documentation?

A bunch of my work has been Perl scripting for the purpose of re-formatting Excel spreadsheets; do you think I should print out before-and-after versions of the Excel files I've modified, or is that overkill? Is it normal for multi-page spreadsheets to get printed out and stapled into lab notebooks?

I've produced numerous R graphics for exploratory reasons; should I print all of those out with notes explaining how they informed my thought process or would it be good enough to just print out the ones that make their way into a paper submission or a poster or something? 

Some of the software I use is web-based and requires a subscription to access; should I be concerned about revealing their intellectual property if I take screenshots and explain how I used the software?

 

I know that some of the answers to my questions might depend on specifics I haven't supplied. At this point, I'm just interested in getting an idea of how other bioinformaticians keep notebooks and what are some good resources I can use to learn more about the best practices. Thank you. 

Update: 
Having spoken to my bosses again, their main concerns can be summarized by the following two quotes:

- "A lab notebook is a legally-binding document which records everything you did in the lab; if it's not written down, it didn't happen"

- "At some point in the future, we want to know what we did and why"

notebook lab R record keeping • 2.6k views
modified 6.4 years ago by seidel • written 6.4 years ago by sviatoslav.kendall

see Possible Electronic Lab Notebook Systems:

written 6.4 years ago by Pierre Lindenbaum

Regarding the IP of "your" software, if you wrote it while employed by the lab then it's already not your IP (technically, the university is even the author, since it's a work for hire).

written 6.4 years ago by Devon Ryan

I imagine things can get pretty complicated legally if you work for multiple institutions and recycle code a lot.

written 6.4 years ago by sviatoslav.kendall

It depends on your institution, position (Graduate Student versus Post-Doc), and your particular supervisor.

written 6.4 years ago by DG

At least in the US, it's purely a function of your contract type. Graduate students and post-docs are both considered employees, so it's a work-for-hire situation (meaning the employer owns all IP unless there's a separate clause in the employment contract specifying otherwise...which would be pretty unusual). This may be completely different in other countries.

In any case, most places will be understanding about letting you take things with you. It's when they start seeing dollar signs that problems arise.

written 6.4 years ago by Devon Ryan
dariober (WCIP | Glasgow | UK) wrote 6.4 years ago:

I work on different projects for different people and I'm quite happy with Redmine to keep track of what I do; in particular, I use the Wiki pages. For archiving scripts and programs I'd suggest familiarizing yourself with a version control tool like Subversion. A bit off-topic, but I would also set up a database (I like PostgreSQL) to archive samples, libraries, etc.

All these tools have some overhead to set up and manage; whether it is worth the effort depends on your specific case. Just my 2p...

written 6.4 years ago by dariober
Ryan Dale (Bethesda, MD) wrote 6.4 years ago:

If the goal is to be able to reproduce your work, then that seems nearly impossible to do strictly from a physical representation. I'm just imagining the process of transcribing code from hundreds of printed-out pieces of paper into a text editor, and all the errors that would create.

In practice, I don't use a physical lab notebook. My primary tool is a git repository for each project, chock full of documentation. And I make sure to redundantly back them up.

But maybe you could tie together a physical notebook and digital resources by using version control, only printing out final-ish results, and referring to commits in your notebook. For example you could add a printed-out plot to the notebook with text like "this figure, from plotting.R in commit a1385df1, shows . . ."
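As a rough illustration of that idea, here is a minimal Python sketch that builds the line of text to write next to a printed figure. The function names, the figure file name, the note text, and the date are all hypothetical; only `git rev-parse --short HEAD` is a real git command, and the helper returns None if git or a repository is not available.

```python
import subprocess
from datetime import date


def current_commit(repo_dir="."):
    """Return the short hash of HEAD in repo_dir, or None if git is unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def notebook_entry(figure, script, commit, note, day=None):
    """Format one line to write (or paste) next to a printed figure."""
    day = day or date.today()
    return f"{day.isoformat()}: {figure}, from {script} in commit {commit}, shows {note}"


# Made-up example values; in practice commit would come from current_commit().
print(notebook_entry("fig1.pdf", "plotting.R", "a1385df1",
                     "coverage by sample", day=date(2020, 1, 15)))
```

Committing before printing matters here: the hash only identifies the code that made the figure if the working tree had no uncommitted changes at the time.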

modified 9 months ago by RamRS • written 6.4 years ago by Ryan Dale
seidel (United States) wrote 6.4 years ago:

Something which is not currently clear to me from your question is: how much documentation do you currently produce? Do you effectively document not only your code, but how and why you used it for various purposes, i.e. for each of your projects? Is your current method of recording your exposition organized, secure, stable, and accessible by others? If so, then you have no problem. You may just need some easy way to print out summaries that reference your electronic system in some way. You might be able to get away with using some form of version control as others have suggested, by which you can refer to a checksum or version number in your printed summary.
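One way to get checksums into a printed summary, sketched in Python under the assumption that your results live in a single directory: compute a SHA-256 digest for each file and print the resulting list. The function names and the `results` directory in the usage note are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(directory):
    """Return 'digest  relative/path' lines for every file under directory."""
    root = Path(directory)
    files = sorted(p for p in root.rglob("*") if p.is_file())
    return [f"{sha256_of(p)}  {p.relative_to(root).as_posix()}" for p in files]
```

Calling `print("\n".join(manifest("results")))` would give a short list that can be printed, dated, and pasted into the notebook; anyone can later recompute the digests to check that the electronic files have not changed.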

The quotes from your bosses tell me that it's less an issue of dead trees, and more an issue of making sure everything is effectively documented.

I know many bioinformaticists who generate countless files without ever writing down why, or what they were thinking. Somehow, they think they don't have to. Your quote: "I've produced numerous R graphics for exploratory reasons; should I print all of those out with notes explaining how they informed my thought process or would it be good enough to just print out the ones that make their way into a paper..." begs the question, did you actually write down how they informed your thought process? If so, what method do you currently use? I'm sure it can be adapted to satisfy your bosses somehow. On the other hand, if you have not written down (typed up) any exposition to go along with your exploratory figures - then this is the problem.

Consider a bench scientist asking the same question: "Do I have to write down all my experiments? Or just the ones that make it into the paper?" And this, in my opinion, marks a major difference in training between bench scientists and informatics scientists. Bench scientists are trained to keep notebooks in which they write down everything, because they don't know what will be important later; therefore it's all important. Less than 1% of that stuff ever gets published, and this has no relationship to whether or not it should be written down. The same is true for informatics people: whether or not something gets published has nothing to do with whether it gets written down. It should always get written down, because of those boss comments.

Three elements to your problem: (1) write it down, (2) use a system so that it's accessible by yourself and others, (3) satisfy the "legally-binding" part (which may include things like checksums and witnessing). Personally I don't think it's necessary that everything you do has to exist in some printed form. If you're doing (1), then (2) is a way to prove to others that (1) is occurring; and if (1) and (2) are occurring, your bosses will be assured, and satisfying (3) is adapting your system for legal protection - which could be simple.

As to method (2), there is a lot of previous information available:

and many of those could have some convention applied resulting in a printed document (that may heavily reference the electronic stuff). There are also some systems which lend themselves to a printed form (i.e. literate programming that can result in a printed or other form of document): R's knitr, Emacs Org mode, etc.

modified 9 months ago by RamRS • written 6.4 years ago by seidel