Paper Or Detailed Tutorial For DNA Variant Calling Pipeline? Need Help To Start
10.8 years ago
newDNASeqer ▴ 760

I am a newbie to high-throughput DNA sequencing analysis and have just started a postdoc in this area. I used to do wet-lab biology, but I have a great deal of experience using Linux and writing code in Java and Python. The learning curve for DNA variant calling seems pretty steep to me.

Since I started working in this new lab, I have followed a Nature Protocols paper to run an RNA-Seq pipeline: TopHat - Cufflinks - Cuffmerge - Cuffdiff. The process is not hard; it just involves a lot of waiting on the computer.

I am not sure where I should start for DNA variant calling. Can anyone point me to a paper or an online step-by-step protocol? I appreciate your reply.

dna variant calling pipeline • 5.1k views

I would also add that some of your choices will depend on whether you are doing whole-genome sequencing or target enrichment such as exome sequencing. I think there is a general problem in genomics right now: people do not publish the full pipelines for their analyses in enough detail. Still, if you look through papers in the area you are working on, especially ones from the last 2-3 years, you should get an idea of which tools people are using and some of their parameter settings. Most people stick with fairly default parameters, and my personal feeling is that BWA + GATK is probably the most widely used protocol in general.
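To give a feel for what a BWA + GATK run involves, here is a minimal Python sketch that only assembles the shell commands for one paired-end sample; it does not execute anything. It is an illustration under assumptions, not anyone's published pipeline: it assumes GATK4-style command syntax (newer than this thread) and samtools 1.x, the file names and the `build_pipeline` helper are hypothetical, and steps such as base quality recalibration and variant filtering are left out.

```python
import shlex


def build_pipeline(ref, fq1, fq2, sample, threads=4):
    """Return shell command strings for a basic align-then-call workflow.

    Assumes bwa, samtools (1.x) and gatk (GATK4 wrapper) are on PATH.
    All output file names are derived from `sample` and are illustrative.
    """
    q = shlex.quote
    # bwa mem expects literal "\t" separators inside the read-group string.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    metrics = f"{sample}.dup_metrics.txt"
    vcf = f"{sample}.vcf.gz"
    return [
        # One-time reference preparation.
        f"bwa index {q(ref)}",
        f"samtools faidx {q(ref)}",
        f"gatk CreateSequenceDictionary -R {q(ref)}",
        # Align reads and coordinate-sort the alignments.
        f"bwa mem -t {threads} -R {q(read_group)} {q(ref)} {q(fq1)} {q(fq2)}"
        f" | samtools sort -o {q(sorted_bam)} -",
        # Flag PCR/optical duplicates, then index the resulting BAM.
        f"gatk MarkDuplicates -I {q(sorted_bam)} -O {q(dedup_bam)} -M {q(metrics)}",
        f"samtools index {q(dedup_bam)}",
        # Call variants on the deduplicated alignments.
        f"gatk HaplotypeCaller -R {q(ref)} -I {q(dedup_bam)} -O {q(vcf)}",
    ]


for command in build_pipeline("ref.fa", "r1.fq", "r2.fq", "s1"):
    print(command)
```

Printing the commands first makes it easy to review them, or paste them into a job script, before committing hours of compute time.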

That said, there are some papers showing how little the variants called by different pipelines overlap when run on the exact same data. One from this year was published in Genome Medicine and is worth reading: http://genomemedicine.com/content/5/3/28

That publication will give you an idea of a few different pipelines. You may also want to check out GCAT (http://www.bioplanet.com/gcat/), which has test datasets you can use to test your pipeline choices against other pipelines on the same data. It also lets you compare any four pipeline setups at a time on the same datasets.
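If you want to check the overlap between two pipelines yourself, a rough first pass is to compare the call sets directly. The sketch below is a simple illustration, not the method used in the paper or by GCAT: it keys each variant on (chromosome, position, ref, alt) from plain-text VCF lines and counts shared and pipeline-specific calls. The function names and the toy records are made up for the example, and exact-match comparison will undercount agreement when the same indel is represented differently by two callers.

```python
def load_variants(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from uncompressed VCF text lines."""
    variants = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/meta lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # Split multi-allelic records so each alternate allele is one key.
        for allele in alt.split(","):
            variants.add((chrom, int(pos), ref, allele))
    return variants


def compare(calls_a, calls_b):
    """Summarise the overlap between two sets of variant keys."""
    shared = calls_a & calls_b
    union = calls_a | calls_b
    return {
        "shared": len(shared),
        "only_a": len(calls_a - calls_b),
        "only_b": len(calls_b - calls_a),
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# Toy records (invented for illustration), standing in for two pipelines' output.
pipeline_a = [
    "##fileformat=VCFv4.1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tC\tT\t40\tPASS\t.",
    "chr2\t50\t.\tG\tA,C\t60\tPASS\t.",
]
pipeline_b = [
    "##fileformat=VCFv4.1",
    "chr1\t100\t.\tA\tG\t55\tPASS\t.",
    "chr2\t50\t.\tG\tA\t30\tPASS\t.",
    "chr3\t10\t.\tT\tC\t20\tPASS\t.",
]

summary = compare(load_variants(pipeline_a), load_variants(pipeline_b))
print(summary)
```

For real files you would pass an open file handle (or `gzip.open(path, "rt")` for compressed VCFs) to `load_variants` instead of a list.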

10.8 years ago
rob234king ▴ 610

I’ve put together a tutorial website with four comprehensive core tutorials: RNA-Seq, ChIP-Seq, genome assembly, and SNP calling.

http://elvis.ccc.cranfield.ac.uk/CUBELP2/
