Mapping mate-pair reads with bowtie2
8.9 years ago

Hello,

I am mapping mate-pair reads with bowtie2 (I use bwa for most applications, but I wasn't happy with the result). I am using the following commands:

bowtie2 -X 500 -S MPout.sam --fr --very-sensitive -x reference.fasta -1 MPR1.fastq MPR2.fastq
bowtie2 -X 10000 -S MPout.sam --fr --very-sensitive -x reference.fasta -1 MPR1.fastq MPR2.fastq

(I reverse-complemented the reads prior to mapping, so the fr orientation should be correct.)
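For readers who want to reproduce the reverse-complement step, here is a minimal sketch, not necessarily how the reads in this post were processed. It assumes plain 4-line FASTQ records (no wrapped sequences); the function name `revcomp_fastq` and the output file name in the usage comment are hypothetical.

```shell
# Reverse-complement every record of a 4-line-per-record FASTQ read from stdin.
# Sequence lines are reversed and complemented; quality lines are only reversed
# so that each quality score stays attached to its base.
revcomp_fastq() {
  awk '
    # Return the string s with its characters in reverse order.
    function rev(s,   i, out) {
      out = ""
      for (i = length(s); i > 0; i--) out = out substr(s, i, 1)
      return out
    }
    # Complement A/C/G/T (either case); any other character is left unchanged.
    function comp(s,   i, c, idx, out) {
      out = ""
      for (i = 1; i <= length(s); i++) {
        c = substr(s, i, 1)
        idx = index("ACGTNacgtn", c)
        out = out (idx ? substr("TGCANtgcan", idx, 1) : c)
      }
      return out
    }
    NR % 4 == 2 { print comp(rev($0)); next }   # sequence line
    NR % 4 == 0 { print rev($0); next }         # quality line
    { print }                                   # header and "+" lines
  '
}

# Example usage (file names are placeholders):
#   revcomp_fastq < MPR1.fastq > MPR1.rc.fastq
```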

The results for both commands and 10,000 reads are identical:

2500 reads; of these:
  2500 (100.00%) were unpaired; of these:
    2168 (86.72%) aligned 0 times
    60 (2.40%) aligned exactly 1 time
    272 (10.88%) aligned >1 times
13.28% overall alignment rate

This doesn't seem right: in my opinion, a correct -X parameter should result in a higher proportion of mapped reads. Or am I missing an important parameter? I might be, as I have never used bowtie2 with mate-pair reads. There shouldn't be any paired-end read contamination in my library (I have already filtered those out).

Thanks for your help. If you recommend something other than bowtie2, could you please provide the command you use for mate-pair reads?

aligner mate-pair bowtie2 • 5.0k views

I recommend trying BBMap; it will generally yield a substantially higher alignment rate.

bbmap.sh ref=reference.fasta
bbmap.sh in1=MPR1.fastq in2=MPR2.fastq rcs=f out=mapped.sam

For even higher sensitivity, you can add the "slow" flag, but that should not be necessary.

Your mapping rate seems low overall. Are you mapping against a distant genome? And why didn't you use the --rf flag with the original read orientation?
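The --rf suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `bowtie2_matepair_cmd`, the index basename `reference_index`, and the output name `MPout_rf.sam` are hypothetical, and the 10000 bp cap is taken from the question. Note that bowtie2 takes the second mate file via -2, and -x expects the basename of an index built with bowtie2-build rather than the FASTA file itself. The sketch only prints the command so it can be inspected before running.

```shell
# Print a bowtie2 command for mate-pair reads in their original (rf) orientation.
# Arguments (all placeholders supplied by the caller):
#   $1 = bowtie2 index basename   $2 = R1 FASTQ   $3 = R2 FASTQ
#   $4 = maximum fragment length for -X           $5 = output SAM file
bowtie2_matepair_cmd() {
  printf 'bowtie2 --rf -X %s --very-sensitive -x %s -1 %s -2 %s -S %s\n' \
    "$4" "$1" "$2" "$3" "$5"
}

# Assumed names: reference_index and MPout_rf.sam are illustrative only.
cmd=$(bowtie2_matepair_cmd reference_index MPR1.fastq MPR2.fastq 10000 MPout_rf.sam)
echo "$cmd"
```

To actually run it, one would first build the index (for example `bowtie2-build reference.fasta reference_index`) and then execute the printed command.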

Why would you filter out paired-end reads? To my understanding, they differ from mate pairs only in the "insert" size, meaning the fragment length.

8.9 years ago
arnstrm ★ 1.8k

I use GSNAP and I am happy with my results.

My code is here:

https://github.com/aseetharam/common_scripts/blob/master/gsnap_pe_noclip_final.sh

Specifically, look for these commented lines:

# if mate pairs use: --orientation=RF
# and specify the insert size using --pairlength=2000 (for 2kb insert)
# and --pairmax=5000 (Max total genomic length for DNA-Seq paired reads)

EDIT: You may want to comment out --split-output to avoid splitting the output file.
