Question: How do mappers behave at end of reference sequence?
4.2 years ago, nickp60 (30) wrote:

How do different mappers handle reads that extend past the end of a reference sequence? I am unclear how BWA, Bowtie2, or <insert favorite mapper> score reads that map to the end of a chromosome but partially extend past the reference. In particular, I am interested in how reads mapped to the 'end' of a bacterial genome are scored (in reality, reads spanning the bacterial origin), given that the genome must be represented as linear in a typical .fasta file.

modified 4.2 years ago • written 4.2 years ago by nickp60 (30)

See this thread and its top-rated answer: "Circular Genome??". This behavior may still be current, though I have not specifically checked. You may also see it show up as soft-clipping at the start or end of reads.
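To check whether soft-clipping at the reference edges is happening in your own data, something like the following sketch could be run over the POS and CIGAR fields of a SAM file. It is only an illustration: the function names (`parse_cigar`, `ref_span`, `edge_clip`) are made up for this example, and it assumes that an origin-spanning read appears as a soft-clip ("S") sitting exactly at the first or last base of the reference.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) tuples; '*' yields an empty list."""
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]


def ref_span(ops):
    """Number of reference bases covered (M, D, N, = and X consume the reference)."""
    return sum(n for n, op in ops if op in "MDN=X")


def edge_clip(pos, cigar, ref_len):
    """Report whether a read is soft-clipped against a reference edge.

    pos is the 1-based leftmost mapped position (SAM POS field).
    Returns 'start', 'end', or None.
    """
    ops = parse_cigar(cigar)
    if not ops:
        return None
    end = pos + ref_span(ops) - 1  # 1-based last reference base covered
    if ops[0][1] == "S" and pos == 1:
        return "start"
    if ops[-1][1] == "S" and end == ref_len:
        return "end"
    return None
```

For example, on a hypothetical 1000 bp reference, `edge_clip(921, "80M20S", 1000)` flags a read whose last 20 bases hang off the end. Counting such reads next to the coverage at the two ends gives a rough idea of how many bridging reads were clipped rather than dropped.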

modified 4.2 years ago • written 4.2 years ago by genomax (91k)

My concern is that there is a strong possibility (without knowing how the mappers handle such reads) that the bridging reads will be underrepresented or rejected.

written 4.2 years ago by nickp60 (30)

For most aligners you should expect a complete drop in coverage as you approach the edge of chromosomes/contigs. You might be able to get bwa mem to do the back splicing if your reads are long enough (it'll do this with supplementary alignments), but in general aligners are tailored toward mammalian chromosomes.
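One workaround sometimes used for circular genomes (not something confirmed in this thread, so treat it as an assumption) is to append the first few hundred bases of the sequence to its end before indexing, so that origin-spanning reads have a contiguous target, and then fold the mapped positions back afterwards. A minimal sketch, with hypothetical helper names `pad_circular` and `to_circular_pos`:

```python
def pad_circular(seq, pad):
    """Return seq with its first `pad` bases appended to the end.

    The padded sequence lets a read that spans the origin of a circular
    genome align contiguously against a linear reference.
    """
    if pad < 0 or pad > len(seq):
        raise ValueError("pad must be between 0 and len(seq)")
    return seq + seq[:pad]


def to_circular_pos(pos, genome_len):
    """Map a 1-based position on the padded reference back onto the
    original circular genome (positions past genome_len wrap to the start)."""
    return (pos - 1) % genome_len + 1
```

With a toy 10 bp genome, `pad_circular("ACGTACGGTT", 3)` gives `"ACGTACGGTTACG"`, and position 11 on the padded sequence folds back to position 1. A pad of roughly one read length is probably a sensible starting point; reads that fall entirely inside the duplicated region can align equally well to both copies, so their mapping quality may drop.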

written 4.2 years ago by Devon Ryan (97k)

Here is one solution posted on SeqAnswers.

written 4.2 years ago by genomax (91k)

Quick Follow-up: I contacted the BWA package author, and he confirms that these overhangs are treated as clipping. I have not had any luck capturing more reads by changing/removing the clipping penalty, so it would be interesting to hear if anyone has found a way to do that.

A cursory run through with SMALT showed noticeably increased recovery of overhanging end reads, but I haven't done a very thorough benchmarking. Thanks all for your input; has anyone else found this limitation of BWA problematic?

written 4.2 years ago by nickp60 (30)