Question: BWA + Samtools
6.9 years ago, Ashutosh Pandey (Philadelphia) wrote:

Hello Everyone,

I used BWA to align SOLiD mate-pair reads (60, 60) with the parameters -n 8 (total mismatches), -l 25 (seed length) and -k 2 (mismatches in the seed). I am getting a good mapping rate of around 65%.

BWA writes every read to the output regardless of whether it is mapped, unmapped, mapped in a proper pair, or carries other bitwise flags. To deal with this, I converted my SAM file to BAM. Because I am not interested in inversions or other unusual variants, I had to filter the file so that it could be used for high-confidence SNP and indel calling. I then used:

samtools view -b -f 67 -f 31 -f 179 -f 115 old.bam > new.bam

67 and 31 (paired, mapped, and properly paired); 179 and 115 (paired, mapped, properly paired, and both mates mapped reverse-complemented on the same strand)
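For reference, the flag values above can be checked by decoding them bit by bit. The following is a minimal sketch, assuming the bit definitions of the FLAG field in the SAM specification; the helper names `decode_flag` and `passes_f` are hypothetical and are not part of samtools. `passes_f` mirrors the documented meaning of a single `-f` value: a read is kept only if every bit in that value is set in the read's flag.

```python
# Bit meanings of the SAM FLAG field (per the SAM specification).
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def passes_f(read_flag, required):
    """True if every bit of `required` is set in `read_flag`,
    which is the condition a single `-f required` filter tests."""
    return (read_flag & required) == required


# Decode the four values used in the command above.
for value in (67, 31, 179, 115):
    print(value, decode_flag(value))
```

Under this decoding, 67 is paired + proper pair + first in pair, while 31 also includes the "unmapped" and "mate unmapped" bits, so it may be worth re-checking each value against the intended filter before relying on it.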

Once I had new.bam, I sorted it, removed the duplicates using samtools, and then used mpileup to call SNPs and indels.

Below are my yes-or-no questions:

1) This is my first time doing an NGS analysis. Am I doing things correctly? Is the order of the steps I am performing correct?
2) As I only want to use high-confidence reads, I have filtered out all unmapped and not properly paired reads. Do you think the bitwise flags I have used are correct?

Although I tried to remove duplicates from my mate-pair BAM data using samtools, I can still see a lot of mate pairs mapped to the same position as other mate pairs. Some people have suggested using Picard. I also used the 3'-end trimming option in BWA, so reads that were duplicates before trimming may no longer look like duplicates afterwards, because trimming changed the length of some reads. Can anyone tell me how to resolve this issue?

Thanks -Ashutosh

Tags: samtools, bwa
6.9 years ago, Sean Davis (National Institutes of Health, Bethesda, MD) wrote:

I don't want to suggest that you follow this workflow exactly, but you might take a look at this page:

http://www.broadinstitute.org/gsa/wiki/index.php/Best_Practice_Variant_Detection_with_the_GATK_v3
