Question: Data management of whole exome sequencing

9.3 years ago

mangfu100 ▴ 800

Greetings, all.

My question is about how to pre-process exome sequencing files before variant calling.

I aligned the reads with both bwa and bowtie, hoping to improve the accuracy of my results and reduce false positives.

The next major steps are duplicate read removal, indel realignment, and base quality score recalibration.

However, I heard that before duplicate removal there is another minor step called 'AddOrReplaceReadGroups'.

Is it okay to skip this step, or will skipping it change the variants that are called?

I think preprocessing is very important for detecting variants accurately, which is why I am asking in this forum.

next-gen-sequencing genome • 2.0k views

updated 2.1 years ago by Ram 43k • written 9.3 years ago by mangfu100 ▴ 800

Ram · Accepted Answer · 2015-01-17

9.3 years ago

GouthamAtla 12k

Read group information is required for variant calling with GATK. You can add it either during alignment or afterwards with the Picard tools.
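To make the two options concrete, here is a minimal Python sketch that only builds the command lines; it does not run them. It assumes `bwa mem` with its `-R` read-group option for the "while aligning" route and Picard's `AddOrReplaceReadGroups` (legacy `KEY=value` argument style) for the "later" route. The function names, file names, sample names, and the default platform and unit values are placeholders I made up for illustration, not anything from the original post.

```python
# Sketch only: builds argument lists for adding read groups; nothing is executed.
# All file names, sample names, and defaults below are hypothetical placeholders.

def read_group_header(rg_id, sample, library, platform="ILLUMINA", unit="unit1"):
    """Return an @RG line for `bwa mem -R`.

    The separators are a literal backslash followed by 't'; bwa converts
    them to real tabs when it writes the SAM header.
    """
    fields = [("ID", rg_id), ("SM", sample), ("LB", library),
              ("PL", platform), ("PU", unit)]
    return "@RG" + "".join(f"\\t{key}:{value}" for key, value in fields)


def bwa_mem_command(reference, fastq1, fastq2, rg_header):
    """Option 1: attach the read group while aligning."""
    return ["bwa", "mem", "-R", rg_header, reference, fastq1, fastq2]


def picard_add_read_groups_command(in_bam, out_bam, rg_id, sample, library,
                                   platform="ILLUMINA", unit="unit1",
                                   picard_jar="picard.jar"):
    """Option 2: add or replace the read group on an existing BAM."""
    return ["java", "-jar", picard_jar, "AddOrReplaceReadGroups",
            f"I={in_bam}", f"O={out_bam}",
            f"RGID={rg_id}", f"RGSM={sample}", f"RGLB={library}",
            f"RGPL={platform}", f"RGPU={unit}"]


if __name__ == "__main__":
    header = read_group_header("rg1", "sample1", "lib1")
    print(" ".join(bwa_mem_command("ref.fa", "r1.fq", "r2.fq", header)))
    print(" ".join(picard_add_read_groups_command(
        "aligned.bam", "aligned.rg.bam", "rg1", "sample1", "lib1")))
```

Either list could be passed to `subprocess.run` once the paths point at real files. Whichever route you take, the sample name (`SM`) is the field to get right, since GATK uses it to decide which reads belong to which sample.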

This tutorial might be helpful: Tutorial (How to analyze) on Whole Exome sequencing. Common Errors. Best Practices.

updated 2.1 years ago by Ram 43k • written 9.3 years ago by GouthamAtla 12k