Question: Bowtie2 and BWA-MEM giving very different results in metagenomic data
Asked 20 days ago by elsoja:

I've assembled a metagenome using MEGAHIT and have begun testing different mapping options for binning the contigs. However, I've noticed that Bowtie2 and BWA-MEM give very different mapping rates against the assembly:

Bowtie2:

182783328 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
55100932 + 0 mapped (30.15% : N/A)
182783328 + 0 paired in sequencing
91391664 + 0 read1
91391664 + 0 read2
46385654 + 0 properly paired (25.38% : N/A)
49135024 + 0 with itself and mate mapped
5965908 + 0 singletons (3.26% : N/A)
2364578 + 0 with mate mapped to a different chr
1755482 + 0 with mate mapped to a different chr (mapQ>=5)

BWA-MEM:

184716120 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
1932792 + 0 supplementary
0 + 0 duplicates
116629821 + 0 mapped (63.14% : N/A)
182783328 + 0 paired in sequencing
91391664 + 0 read1
91391664 + 0 read2
93117324 + 0 properly paired (50.94% : N/A)
107719404 + 0 with itself and mate mapped
6977625 + 0 singletons (3.82% : N/A)
14496150 + 0 with mate mapped to a different chr
11129016 + 0 with mate mapped to a different chr (mapQ>=5)

BWA-MEM mapped way more reads than Bowtie2. As the metagenome was assembled using those reads, I think that Bowtie2 mapping only 30% of them is quite strange.
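
One caveat when comparing the two reports: the BWA-MEM total (184716120) exceeds the number of input reads (182783328) by exactly the 1932792 supplementary records, so its 63.14% is computed over alignment records rather than over reads. Below is a minimal sketch, assuming plain `samtools flagstat` text on stdin, that recomputes the rate over primary reads only. The function name `primary_mapped_pct` is made up for illustration, and only the flagstat lines it needs are fed in.

```shell
# Percentage of primary reads that are mapped, from `samtools flagstat` text on stdin.
# Secondary and supplementary records are subtracted from both the mapped count
# and the total, since such records are always mapped.
primary_mapped_pct() {
    awk '
        $4 == "in" && $5 == "total" { total = $1 }
        $4 == "secondary"           { sec = $1 }
        $4 == "supplementary"       { sup = $1 }
        $4 == "mapped"              { mapped = $1 }
        END {
            primary = total - sec - sup
            if (primary > 0)
                printf "%.2f\n", 100 * (mapped - sec - sup) / primary
        }'
}

# BWA-MEM numbers from the report above
primary_mapped_pct <<'EOF'
184716120 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
1932792 + 0 supplementary
116629821 + 0 mapped (63.14% : N/A)
EOF
```

For the BWA-MEM report this prints 62.75, so the correction is small and the gap to Bowtie2 remains.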

What may be causing this difference? As the .bam file will be used for binning, using the output of one tool or the other will greatly affect downstream analysis.

Thanks!


Can you post the bowtie2 and bwa commands used?

written 20 days ago by h.mon

Sure! They were both executed with the default parameters.

bwa mem -t 80 bwa_index reads_1.fastq.gz reads_2.fastq.gz | samtools view -bS - > bwa.bam
bowtie2 --threads 80 -x bowtie_index -1 reads_1.fastq.gz -2 reads_2.fastq.gz | samtools view -bS - > bowtie2.bam
written 20 days ago by elsoja
Answer by ATpoint (Germany), 20 days ago:

I guess a fair comparison would require running bowtie2 in --local mode, since its default is end-to-end, whereas bwa mem defaults (afaik) to local / soft-clipped alignments.
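
A sketch of what that rerun could look like, assuming the same index and read files as in the commands posted above. The wrapper `map_local` and the output name `bowtie2_local.bam` are hypothetical. Only `--local` differs from the original Bowtie2 call.

```shell
# Hypothetical helper: map a read pair with Bowtie2 in local (soft-clipping) mode.
# Arguments: index prefix, R1 FASTQ, R2 FASTQ, output BAM, optional thread count.
map_local() {
    index=$1; r1=$2; r2=$3; out=$4; threads=${5:-4}
    bowtie2 --local --threads "$threads" -x "$index" -1 "$r1" -2 "$r2" \
        | samtools view -b - > "$out"
}

# Example call, mirroring the file names used in this thread:
# map_local bowtie_index reads_1.fastq.gz reads_2.fastq.gz bowtie2_local.bam 80
```

Running `samtools flagstat` on the resulting BAM then gives numbers that can be compared with the BWA-MEM report.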


I think you are right. I knew that BWA-MEM uses soft-clipping, but I had never checked whether that's the case for Bowtie2. Indeed, Bowtie2 is end-to-end by default.

I'll do a test and post the result here.

written 20 days ago by elsoja

I don't think this is the main source of the problem, even though it certainly contributes to it. With local alignment, the percentage of mapped reads goes from 30.15% to 42.56%. That's a large increase, but it's still far from BWA-MEM's 63.14%.

182783328 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
77797217 + 0 mapped (42.56% : N/A)
182783328 + 0 paired in sequencing
91391664 + 0 read1
91391664 + 0 read2
66792824 + 0 properly paired (36.54% : N/A)
70257200 + 0 with itself and mate mapped
7540017 + 0 singletons (4.13% : N/A)
3182446 + 0 with mate mapped to a different chr
2437169 + 0 with mate mapped to a different chr (mapQ>=5)

I'm getting similar results for other metagenomic datasets that I'm analyzing.

written 19 days ago (modified 19 days ago) by elsoja