Question: Mapping mate-pair reads with bowtie2
I am mapping mate-pair reads with bowtie2 (I use bwa for most applications, but I wasn't happy with the result). I am using the following commands:

    bowtie2 -X 500 -S MPout.sam --fr --very-sensitive -x reference.fasta -1 MPR1.fastq MPR2.fastq

    bowtie2 -X 10000 -S MPout.sam --fr --very-sensitive -x reference.fasta -1 MPR1.fastq MPR2.fastq

(I reverse complemented the reads prior to the mapping so fr orientation should be correct)


The results for both commands and 10,000 reads are identical:

    2500 reads; of these:
      2500 (100.00%) were unpaired; of these:
        2168 (86.72%) aligned 0 times
        60 (2.40%) aligned exactly 1 time
        272 (10.88%) aligned >1 times
    13.28% overall alignment rate


This doesn't seem right: in my opinion, a correct -X parameter should result in a higher proportion of reads mapped. Or am I missing an important parameter? I might be, since I have never used bowtie2 with mate-pair reads. There shouldn't be any paired-end read contamination in my library (I filtered those out already).

Thanks for the help. If you recommend something other than bowtie2, could you please provide the command you use for mate-pair reads?



I recommend trying BBMap; it will generally yield a substantially higher alignment rate:

    bbmap.sh ref=reference.fasta in1=MPR1.fastq in2=MPR2.fastq rcs=f out=mapped.sam

For even higher sensitivity, you can add the "slow" flag, but that should not be necessary.

Your mapping rate seems low overall. Are you mapping against a distant genome? And why didn't you use the --rf flag with the original read orientation?
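
For reference, here is a minimal sketch of what a run on the original (not reverse-complemented) reads could look like. The file names are taken from the question; the index basename `reference` is an assumption, since `-x` expects the basename of an index built with `bowtie2-build` rather than the FASTA file itself. Note also that the commands as posted have no `-2` before the second mate file, which may be why the summary reports every read as unpaired. The snippet only prints the command unless bowtie2 and the input files are actually present.

```shell
# Sketch only: file names come from the question; IDX is a hypothetical index basename.
REF=reference.fasta
IDX=reference
R1=MPR1.fastq
R2=MPR2.fastq

# -1 / -2 : one mate file each, so bowtie2 treats the reads as pairs
# --rf    : reverse/forward mate orientation, for mate-pair reads that were not flipped
# -X      : maximum fragment length for a valid paired alignment (value from the question)
ALIGN_CMD="bowtie2 --rf -X 10000 --very-sensitive -x $IDX -1 $R1 -2 $R2 -S MPout.sam"
echo "$ALIGN_CMD"

# Build the index and align only if the tool and the inputs exist.
if command -v bowtie2 >/dev/null 2>&1 && [ -f "$REF" ] && [ -f "$R1" ] && [ -f "$R2" ]; then
    bowtie2-build "$REF" "$IDX" && $ALIGN_CMD
fi
```

If the reads are then reported as paired, the concordant/discordant breakdown in the summary should show whether the -X value is actually making a difference.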

Why would you filter out paired-end reads? To my understanding, they differ from mate pairs only by the "insert" size, meaning the fragment length.

I use GSNAP and I am happy with my results.

My code is here:

Specifically look for commented lines:

# if mate pairs use: --orientation=RF
# and specify the insert size using --pairlength=2000 (for 2kb insert)
# and --pairmax=5000 (Max total genomic length for DNA-Seq paired reads)

EDIT: you may want to comment out --split-output to avoid splitting the output file

