Question: Bwa And Bwa Mem Produce Different Alignments
3.8 years ago by
Rad780
Canada
Rad780 wrote:

Hello

I have MiSeq paired-end samples that I previously aligned using the classical approach (bwa aln, bwa sampe, samtools fixmate) and have now realigned using bwa mem.

Once the BAM files were generated and sorted, I ran samtools flagstat to compare the two results, and here is what I found:

bwa

3685558 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
452652 + 0 mapped (12.28%:nan%)
3685558 + 0 paired in sequencing
1842779 + 0 read1
1842779 + 0 read2
158326 + 0 properly paired (4.30%:nan%)
426744 + 0 with itself and mate mapped
25908 + 0 singletons (0.70%:nan%)
139248 + 0 with mate mapped to a different chr
80182 + 0 with mate mapped to a different chr (mapQ>=5)

and bwa mem :

3937425 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
3216740 + 0 mapped (81.70%:nan%)
3937425 + 0 paired in sequencing
1968780 + 0 read1
1968645 + 0 read2
291160 + 0 properly paired (7.39%:nan%)
2523397 + 0 with itself and mate mapped
693343 + 0 singletons (17.61%:nan%)
2097921 + 0 with mate mapped to a different chr
1488875 + 0 with mate mapped to a different chr (mapQ>=5)

Counting only the mapped reads with samtools view -c -F 4, I found 3216740 for bwa mem and 452652 for the classical bwa method.

Although this suggests that bwa mem is much better, I find it strange that the number of reads with the mate mapped to a different chr is so much larger with bwa mem than with bwa.

Any idea what such a difference means? At some locations I see huge coverage compared with the alignment generated by the classical bwa version.

Thanks in advance

samtools • 3.6k views
modified 3.8 years ago by Charles Warden4.9k • written 3.8 years ago by Rad780

That's actually 15% (bwa mem) mapped to a different chromosome versus 30% (bwa), which makes the bwa mem result better than the first one, am I correct? Especially given the high % of mapped reads.

modified 3.8 years ago • written 3.8 years ago by Rad780
3.8 years ago by
Charles Warden4.9k
Duarte, CA
Charles Warden4.9k wrote:

Is there something special about your sample preparation? I think both alignments seem to indicate a relatively high translocation rate, and I would typically see a >90% alignment rate with either normal BWA or BWA-MEM when working with DNA-Seq data in a standard organism (such as a human exome dataset). So the normal BWA alignment seems abnormal (and even BWA-MEM seems suboptimal).

I would also agree with the comment from aradwen - don't forget that you need to consider the proportion of each alignment type relative to the number of aligned (not total) reads.
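To make that concrete, here is a minimal Python sketch that re-expresses the flagstat categories as a percentage of mapped reads. The counts are copied from the flagstat output quoted in the question; the dictionary keys and the `percent_of_mapped` helper are names made up for this illustration.

```python
# Counts copied from the samtools flagstat output quoted in the question.
# The key names below are hypothetical labels, not samtools field names.
flagstat = {
    "bwa aln/sampe": {
        "total": 3685558,
        "mapped": 452652,
        "properly_paired": 158326,
        "singletons": 25908,
        "mate_diff_chr": 139248,
    },
    "bwa mem": {
        "total": 3937425,
        "mapped": 3216740,
        "properly_paired": 291160,
        "singletons": 693343,
        "mate_diff_chr": 2097921,
    },
}


def percent_of_mapped(counts):
    """Express each category as a percentage of mapped (not total) reads."""
    mapped = counts["mapped"]
    return {
        key: round(100.0 * value / mapped, 2)
        for key, value in counts.items()
        if key not in ("total", "mapped")
    }


for aligner, counts in flagstat.items():
    print(aligner, percent_of_mapped(counts))
```

Note that the two runs also report different totals (3685558 vs 3937425 records), so percentages of "total" are not directly comparable between them either.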

written 3.8 years ago by Charles Warden4.9k

Thanks @cwarden45, this is mouse DNA amplified with human primers; most of the primers should fail to amplify the mouse genome. Does this explain the high level of translocation?

written 3.8 years ago by Rad780

I don't know - I think there was one case where I initially aligned a mouse RNA-Seq dataset to the human genome by accident. I remember the alignment percentage going way down (< 50%) but I don't recall what the distribution of aligned reads looked like (and it might have been single-end data anyways). So, I think this might explain why the alignment percentage was low, but I don't think I can say much else.

If you want to only look at mouse DNA, you can first filter out reads that align to the human genome and then see what the aligned read distribution looks like among the remaining reads. I'm guessing you are targeting genes and not expecting any translocations within the gene, so perhaps you can see if this strategy increases the number of "properly paired" alignments.
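A rough sketch of that filtering step, assuming the human alignment is available as SAM text (for example from samtools view -h). It only uses the standard SAM FLAG bit 0x4 (read unmapped); the function name and the example records are invented for illustration. Because both mates share a read name, a pair is kept only when neither mate has a mapped record.

```python
UNMAPPED = 0x4  # SAM FLAG bit: this read is unmapped


def unmapped_read_names(sam_lines):
    """Return the names of reads that have no mapped record at all."""
    seen, mapped = set(), set()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seen.add(name)
        if not flag & UNMAPPED:
            mapped.add(name)
    return seen - mapped


# Hypothetical records: pairA maps as a proper pair, pairB is fully
# unmapped, pairC has one mapped mate and one unmapped mate.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "pairA\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "pairA\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",
    "pairB\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "pairB\t141\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "pairC\t73\tchr1\t500\t60\t50M\t=\t500\t0\t*\t*",
    "pairC\t133\tchr1\t500\t0\t*\t=\t500\t0\t*\t*",
]
print(sorted(unmapped_read_names(example)))
```

The surviving names could then be used to pull the corresponding reads out of the original FASTQ files before realigning them.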

Also, I apologize for not noticing that you commented on your own question ;)

written 3.8 years ago by Charles Warden4.9k

Thank you cwarden45. Actually, the problem is not why the alignment percentage is low (it has to be like that); what I don't get is why it is high with bwa mem!! The reads are not supposed to map at such a high rate!! Strange that bwa mem is reporting this.

written 3.8 years ago by Rad780

I think that is strange - I don't know for sure, but maybe it has to do with something strange about your library preparation. For example, I would say the alignment rate for both BWA and BWA-MEM is >90% in the data that I have worked with. I've never actually seen a sample like this. Maybe someone else can provide more specific help.

modified 3.8 years ago • written 3.8 years ago by Charles Warden4.9k

Neither bwa nor bwa-mem should align this sample at 90%; this is a control sample (mouse DNA aligned to the human genome). The bwa result makes sense, bwa-mem not at all. I looked at the alignments in IGV and they are messy, even though there is a high % of mapping (mapping <> good alignment).
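One way to put a number on "mapped but messy" is to check how much of each read is clipped rather than aligned. The sketch below parses a CIGAR string and returns the clipped fraction; the `clipped_fraction` helper and the example CIGARs are hypothetical, and whether clipping is what makes these particular alignments messy is an assumption, not something established in this thread.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_fraction(cigar):
    """Fraction of the original read that is soft- or hard-clipped.

    Returns None when no CIGAR is available ("*"), e.g. for unmapped reads.
    """
    if cigar == "*":
        return None
    read_len = clipped = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "MI=XSH":  # operations that account for bases of the read
            read_len += n
        if op in "SH":  # clipped bases are not part of the alignment
            clipped += n
    return clipped / read_len if read_len else None


# Hypothetical CIGAR strings: a full-length match and a heavily clipped read.
for cigar in ("150M", "40S60M50S", "*"):
    print(cigar, clipped_fraction(cigar))
```

A read that counts as "mapped" in flagstat can still have most of its bases clipped, so a summary of this fraction across the BAM file says more about alignment quality than the mapping percentage alone.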

written 3.8 years ago by Rad780

Hi Rad,

Any update on this issue? Did you figure out what the problem was?

written 3.0 years ago by amirmhzadeh70

Yes, we had an experiment in the lab that was not supposed to give a bulletproof alignment. bwa-mem failed in the sense that it tried to align the sequences anyway, which is not supposed to happen, so the result does not make sense biologically. I ended up preferring bowtie2 over bwa for these analyses in particular.

written 3.0 years ago by Rad780

Perhaps this has something to do with the fact that bwa-mem uses local alignment and bwa-aln uses global alignment. The way you prepared your library may produce lots of reads that cannot be completely aligned to the reference (meaning there are lots of mismatches in the middle). Reads like that are more acceptable for local alignment than for global alignment.
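A toy illustration of that difference, as a sketch only: the function below is a plain dynamic-programming scorer, not the algorithm of either aligner, and the scoring values, sequences, and names are made up. With local=False the whole read must be aligned (end-to-end, free to start and end anywhere in the reference); with local=True (Smith-Waterman style) only the best-matching part of the read needs to align.

```python
def align_score(read, ref, local, match=1, mismatch=-4, gap=-6):
    """Best alignment score of read against ref.

    local=False: every read base must be aligned (end-to-end).
    local=True:  any substring of the read may be aligned.
    """
    n_ref = len(ref)
    prev = [0] * (n_ref + 1)  # row 0: the alignment may start anywhere in ref
    best = 0
    for i in range(1, len(read) + 1):
        # Column 0: read bases left of the reference cost a gap each,
        # unless local mode lets the alignment restart from zero.
        cur = [0 if local else i * gap]
        for j in range(1, n_ref + 1):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            score = max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
            if local:
                score = max(score, 0)  # drop a poorly matching prefix
                best = max(best, score)
            cur.append(score)
        prev = cur
    # End-to-end: the read is fully consumed, ref may end anywhere.
    return best if local else max(prev)


ref = "ACGTTGCAAGCTTGGATCCA"
core = "GCAAGCTTGG"                    # exact substring of ref
partial = "TTTTT" + core + "AAAAA"     # matching core, unrelated ends

for name, read in (("exact read", core), ("partial read", partial)):
    print(name,
          "end-to-end:", align_score(read, ref, local=False),
          "local:", align_score(read, ref, local=True))
```

For the exact read both modes give the same score, while for the partially matching read the local score stays positive and the end-to-end score is dragged down by the ends. An aligner that requires end-to-end matches would likely reject such a read, whereas a local aligner can report it as mapped with the ends clipped.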

written 4 months ago by CY60