Forum:Getting Things Done In Bioinformatics (More I Know, Less I Do?)
3
10
9.3 years ago

As you read Biostars, blogs, tweets and papers, do you sometimes have the feeling that the more you know, the less able you are to actually get things done?

Are you spending hours or days removing adaptors that might not even matter with your current mapping algorithm? Are you comparing dozens of mapping and variant-calling algorithms just to find out that, for your research question, any of them would actually do the job?

How much does "keeping up to date" influence your own ability to analyze the data, and in which direction? How many minor flaws in your data are you willing to overlook for the sake of speed of the analysis?

(I can move the question to Forum if that would be more appropriate.)

analysis literature Forum • 3.4k views
0

Yeah, forum seems like the right place for this. Moving.

0

I think that the more you know, the more you actually get done, without much thinking and trying. Good results, bad results, they speak for themselves.

0

I agree when talking about programming (be it a higher-level language or a bash/awk/sed one-liner): the more you know, the better your performance is (including not only accuracy, but also speed of progress). I am not 100% sure when talking about biology, hence this post.

7
9.3 years ago

My two cents is that at some point, you have to realize that good enough is okay. No analysis is perfect, but yours should be intelligent, transparent and reproducible. Be aware of the limitations of your tools, and acknowledge them. Above all, keep moving forward, so that the next experiment you do is better than the last.

5

The phrase "Don't let the perfect be the enemy of the good" comes to mind.

2

Similar idea: "Don't let better be the enemy of done".

4
9.3 years ago

Having too many options sometimes hinders progress, as a lot of time is spent pondering things that are relevant but not very important.

Problems: 1) Too much software available to do a task: For example, there is a plethora of QC tools available for FASTQ reads. Just search for FASTQ trimmers and you will find at least 20 different tools that do almost the same thing. The same goes for aligners.

Solution: It's good to spend some time playing with different tools just to get some idea. But stick to one for your tasks. For example, I use Trimmomatic to trim the reads and BWA to align them. I use Trimmomatic because it is easy to use, it is Java-based and runs very fast, and it takes care of orphan reads in paired-end data. Similarly, BWA is fast and produces alignments with relevant tags that I can use to further filter my aligned data. Sometimes, other people have already done the comparisons you are looking for. Try searching for relevant blogs; they will surely help you decide.

2) Trying to be too meticulous while doing the analysis: This was a bad habit of mine. I still suffer from it, but it's not that bad now. There was a time when I would rerun the whole alignment if a new version of the aligner was released while I was still working on my final results. I had the impression that the new version might be better and give me better results, even though I knew there would not be much difference between results from version 1.6 and version 1.7 of an aligner. Similarly, steps like base quality score recalibration with GATK may not influence your end results substantially. But I see people struggling with it for many days because GATK keeps throwing errors about chromosome-order incompatibilities between the BAM, the reference and the VCF file. I am not saying it's unimportant, but if you have already QC'd your reads and have a reliable sequencer, I don't think it matters much.

Solution: Stop yourself from repeating the analysis merely because you have a free cluster and it won't take much time. It's easy to get distracted and end up spending more time.
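For anyone stuck on the chromosome-order errors mentioned above, a quick look at the contig lists can save days. Below is a minimal Python sketch, not GATK's own validation logic, that compares the contig order in a SAM/BAM header and a VCF header against a FASTA index (.fai). The function names are made up for illustration, and the inputs are assumed to be header text you have already extracted (for example with `samtools view -H` for the BAM).

```python
import re


def contigs_from_fai(text):
    """Contig names, in order, from the text of a FASTA index (.fai) file."""
    return [line.split("\t")[0] for line in text.splitlines() if line.strip()]


def contigs_from_sam_header(text):
    """Contig names, in order, from the @SQ lines of a SAM/BAM header."""
    names = []
    for line in text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def contigs_from_vcf_header(text):
    """Contig names, in order, from the ##contig lines of a VCF header."""
    pattern = re.compile(r"##contig=<.*?\bID=([^,>]+)")
    names = []
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            names.append(match.group(1))
    return names


def order_problems(reference, other):
    """Compare `other` against `reference` (both lists of contig names).

    Returns (missing, out_of_order):
      missing      - contigs in `other` that the reference does not have
      out_of_order - adjacent pairs of shared contigs whose relative order
                     in `other` is the reverse of their order in the reference
    """
    rank = {name: i for i, name in enumerate(reference)}
    missing = [name for name in other if name not in rank]
    shared = [name for name in other if name in rank]
    out_of_order = [
        (a, b) for a, b in zip(shared, shared[1:]) if rank[a] > rank[b]
    ]
    return missing, out_of_order


if __name__ == "__main__":
    # Toy inputs, invented for this example.
    fai = "chr1\t1000\t6\t60\t61\nchr2\t800\t1030\t60\t61\nchrM\t100\t1850\t60\t61\n"
    sam = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chrM\tLN:100\n@SQ\tSN:chr1\tLN:1000\n"
    vcf = "##fileformat=VCFv4.2\n##contig=<ID=1,length=1000>\n"
    ref = contigs_from_fai(fai)
    print(order_problems(ref, contigs_from_sam_header(sam)))
    print(order_problems(ref, contigs_from_vcf_header(vcf)))
```

If `missing` is non-empty, the files probably use different naming schemes (e.g. `1` vs `chr1`). If `out_of_order` is non-empty, the files were likely sorted against a differently ordered reference. Either way you know what to fix before launching the tool again.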

3
9.3 years ago
ugly.betty77 ★ 1.1k

There used to be a time when scientists were rewarded for challenging every assumption in their theories. I have been re-reading Darwin's book and was surprised to find out how much space and energy he spent arguing against potential flaws and pitfalls in his theory. He seems to have covered every imaginable species and trait, and did not leave anything untouched. Darwin had difficulty getting his book completed, and he could easily have taken another five or ten years to finish it if he had not been rushed after hearing that Wallace was proposing a similar theory. The famous book is actually an 'abstract' of his theory.

http://en.wikipedia.org/wiki/On_the_Origin_of_Species#Events_leading_to_publication

Today's 'scientists' push papers to get grants to push papers to get grants. By today's standard, Darwin did not get anything done beyond 'submitting an abstract'. You cannot be like that. So, I suggest you stop asking too many questions about your tools, and follow these simple rules -

(i) You get more grants, if you can show more papers and many 'high-visibility' papers. Try to keep those two numbers high.

(ii) Give your paper a catchy and inflated title. That criterion has become almost essential to get into a high-visibility journal.

(iii) Write a strong abstract, and try to focus your bioinformatics work to make sure you have enough strong points for the abstract.

(iv) Figures are the next important part of the paper. Make sure you spend enough time to polish them up.

(v) Submit a letter-type paper so that you do not have to spend too much time writing the remainder of the paper.

(vi) Materials and methods section - try to copy as much as possible from other published work. That way you can blame others if someone objects to one or another step in the procedure.

(vii) If you are writing too many papers following (i) - (vi), the editors and reviewers will keep you busy, and you will not find much time to read Biostars, blogs, Twitter, etc. Your primary problem is that you have too much time on your hands. That is unacceptable if you want to run a lab as a successful PI today.
