Forum:Paper: Ten simple rules for biologists initiating a collaboration with computer scientists
3.9 years ago
JC 13k

Just read this https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008281

I believe that is why Bioinformatics/Computational Biology exists

Thoughts?

article opinion • 1.7k views

Let me list the rules here for easier reference:

  1. Do not try to turn them into biologists
  2. Do not judge knowledge gaps
  3. Learn how computers store data and format information in a computationally friendly way
  4. Describe your data in a way that facilitates collaboration
  5. Learn if you should speed up your code (and how)
  6. Think about concepts and the audience, not the programming language
  7. Aim for transparency and reproducibility, share your code at every stage
  8. Understand strengths and limitations of each discipline
  9. Communicate how the results of your collaboration will be validated
  10. Do not get involved in wars over what the “real” science is

The author also hangs out here as Biomonika (Noolean)

3.9 years ago

It's ok. Would have preferred that she explicitly discuss "Metadata for biologists" and "data cleaning" instead of dancing around the subject. I think she needed to establish data, code, pipelines, notebooks, papers in a more rigid fashion for biologists to understand how to structure things for a computer scientist or statistician. The whole thing had a bit of a college essay feel but I'm happy to see Istvan and Biostars mentioned by name.


Hah! I am very pleased that this made it in. I have always made a strong point in the classroom of how absurd it is to tie "reproducibility" to a specific version of a piece of software. Even beyond that, if results only hold when using TopHat but not HiSat, perhaps that outcome is dubious to begin with (unless there is a documented functionality not implemented in the other software).


Hard agree on that! If something is reproducible, it shouldn't matter which software you use to show it. I might not go as far as to say that what holds true with HiSat should hold true with TopHat (which is clearly documented to just not work as well). But what holds true with HiSat should hold true with STAR etc.
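
One rough way to put a number on "holds true with both tools" is to compare the lists of significant genes each pipeline produces. The snippet below is only a sketch of that idea: the gene names and both hit lists are made up for illustration, and the Jaccard index is just one possible agreement measure, not something proposed in the paper or in this thread.

```python
def jaccard(a, b):
    """Jaccard index: shared items divided by all distinct items."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty result sets agree trivially
    return len(a & b) / len(a | b)


# Hypothetical significant-gene lists from the same reads,
# processed once with each aligner (placeholder names, not real data).
hisat_hits = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
star_hits = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}

overlap = jaccard(hisat_hits, star_hits)
only_hisat = sorted(hisat_hits - star_hits)
only_star = sorted(star_hits - hisat_hits)

print(f"Jaccard overlap: {overlap:.2f}")
print("Only with HiSat:", only_hisat)
print("Only with STAR:", only_star)
```

A low overlap would not say which tool is right, only that the conclusion depends on the tool and deserves a closer look.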


That is the essence of the robustness survey I am conducting: "Paid opportunity: need microbiome experts for a survey and test of robustness"

3.9 years ago

I think it depends. It is probably quite a good guide for bioinformaticians who want to collaborate with "proper" computer scientists when they need new computer science to solve their problems.

I think it's less good for biologists who need to collaborate with someone to give more computational nous to their biology project.

I have two problems with it:

Firstly, it states you shouldn't try to make biologists out of computer scientists, but then goes into a whole load of computer science stuff that biologists ought to learn. It says that instead of teaching CS people to be biologists, they should just teach the intuition necessary. But I kind of feel that the point of intuition is that it can't be taught. One's intuition is constructed for oneself by immersion within the field. When I look at an alignment in IGV, I'm not looking for particular things (or not consciously anyway); rather, things that are wrong are obvious only after you've seen them: my subconscious intuition highlights these things to my conscious brain. You gain this by having a feel for biology.

Secondly, it says that biology people must recognize that the CS people are interested in publishing novel CS, not solving biology problems. I'm not saying that's wrong, but rather that this is why most non-computationally aware biologists would be better off collaborating with a bioinformatician or computational biologist: there just isn't novel CS to be had in most of the things a biologist needs to solve. Thus the computationalist needs to be motivated by solving the biological problem, not the computational one. As most CS people aren't thus motivated, they are the wrong people to collaborate with most of the time.

One giveaway is that the author talks about different programming languages, saying that CS people don't care about the language. I've never met a wet-only biologist that cares at all about different languages; they couldn't care less if a program takes 6 hours or 1 to run. But who does care (often excessively) and engage in language flame wars (mostly in a tongue-in-cheek way)? Bioinformaticians.

3.9 years ago

I would perhaps phrase number 5 in terms of learning about the scalability, limitations, and tradeoffs of the various algorithms, instead of specifically calling out speed as the bottleneck.

Many other bottlenecks may be present; having a sense of the computational complexity would help the communication process.
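
To make the complexity point concrete, here is a toy sketch (my own illustration, not from the paper) of two ways to find duplicated sequences. The function names and the read list are invented for the example. It counts operations rather than timing anything, because the growth rate is what matters when discussing scalability.

```python
from collections import Counter


def duplicates_pairwise(seqs):
    """Compare every pair of sequences: n*(n-1)/2 comparisons."""
    dupes = set()
    comparisons = 0
    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            comparisons += 1
            if seqs[i] == seqs[j]:
                dupes.add(seqs[i])
    return dupes, comparisons


def duplicates_hashed(seqs):
    """Single pass with a hash table: one update per sequence."""
    counts = Counter(seqs)
    dupes = {s for s, c in counts.items() if c > 1}
    return dupes, len(seqs)


# Made-up reads, purely for illustration.
reads = ["ACGT", "TTGA", "ACGT", "GGCC", "TTGA", "CATG"]

slow, slow_ops = duplicates_pairwise(reads)
fast, fast_ops = duplicates_hashed(reads)

print(sorted(slow), slow_ops)
print(sorted(fast), fast_ops)
```

Both give the same answer, but the pairwise count grows quadratically with the number of reads while the hashed count grows linearly. The tradeoff is that the hashed version holds a table of every distinct sequence in memory, so memory rather than speed can become the bottleneck.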
