Question

Using both bowtie and bowtie2 for finding indels

1

9.2 years ago

markus.lippus ▴ 10

Hi!

I'm new to both RNA-seq and bioinformatics itself.

In the project we're working on, we sequenced bacterial RNA from different samples, and one of the things we're trying to find in the data is deletions and insertions resulting from mistakes in ligation of the RNA.

I have chosen to use bowtie2 for alignment, as the argument about which aligner is better seems to be never-ending and also dependent on the situation. But I have seen people note that bowtie2 might prefer a gapped alignment over a non-gapped alignment in some situations.
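For readers wondering why bowtie2 might pick a gapped alignment at all, here is a rough arithmetic sketch. It assumes bowtie2's documented default end-to-end penalties (`--mp 6,2` for mismatches, `--rdg 5,3` and `--rfg 5,3` for gaps) and assumes every mismatch lands on a high-quality base; the function name is made up for illustration, and this is not how bowtie2 computes scores internally.

```python
# Default bowtie2 penalties as documented in its manual
# (--mp 6,2 for mismatches; --rdg 5,3 and --rfg 5,3 for gaps).
MISMATCH_MAX = 6
GAP_OPEN = 5
GAP_EXTEND = 3


def end_to_end_score(mismatches=0, gap_lengths=()):
    """Approximate end-to-end alignment score (0 is best, lower is worse).

    Assumption: every mismatch is on a high-quality base and therefore
    costs the maximum mismatch penalty.
    """
    # A gap of length n costs GAP_OPEN + n * GAP_EXTEND.
    gap_penalty = sum(GAP_OPEN + GAP_EXTEND * n for n in gap_lengths)
    return -(MISMATCH_MAX * mismatches + gap_penalty)


ungapped = end_to_end_score(mismatches=2)   # two mismatches, no gap
gapped = end_to_end_score(gap_lengths=[1])  # one 1-base gap, no mismatches
print(ungapped, gapped)
```

Under these assumptions a single 1-base gap is penalized less than two mismatches, so the gapped placement scores higher whenever the ungapped alternative needs two or more mismatches.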

We had a crazy idea: first use bowtie to map all reads, then take the reads that bowtie did not map (as bowtie doesn't map reads with indels) and map them with bowtie2. It seems like this way I could worry less about getting false positives due to alignment.

Is this a bad idea? I haven't seen anyone propose it, so I suppose there's something wrong with it.

indel RNA-Seq bowtie2 bowtie • 4.5k views

updated 24 months ago by Ram 43k • written 9.2 years ago by markus.lippus ▴ 10

Ram · Answer 1 · 2015-02-10

1

9.2 years ago

Devon Ryan 104k

It sort of depends on how correct you want the results to be. If you're just doing this to get an idea about what regions to look at then this is probably OK. If you want results that will stand up in and of themselves to peer review (at least if I end up being your reviewer), then this method has issues.

If bowtie2 is preferring a gapped alignment over an ungapped one, then that's likely the correct result. Bowtie1 will simply map reads with indels toward one end or the other incorrectly. Maybe it'll map a read to the right place with some mismatches (this is OK), but it might also just map the read elsewhere in the genome and place the mismatches elsewhere in the alignment. This will mess up your results. In an ideal world, you'd just use bowtie2 with local alignment and then use a local realigner to realign everything (including the soft-clipped regions) around indels. GATK is the obvious tool for this, though it unfortunately doesn't realign soft-clipped regions of alignments. Presumably there are other local realigners that handle soft-clipped alignments better (the one I wrote will, but it only works with BSseq alignments, though perhaps I should change that).
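To make the point about indels near a read end concrete, here is a toy sketch. It is a brute-force ungapped placement over a made-up reference, not bowtie's actual algorithm, and the sequences and function names are invented for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_ungapped_hit(read, reference):
    """Return (mismatches, offset) for the best ungapped placement of read.

    Every offset is tried and no gaps are allowed, which only roughly
    mimics the restriction an ungapped aligner works under.
    """
    hits = [
        (hamming(read, reference[i:i + len(read)]), i)
        for i in range(len(reference) - len(read) + 1)
    ]
    return min(hits)


# Made-up reference sequence, for illustration only.
reference = "ACGTTGCATGCCGATAGCTTAGGCTAACGTCAGT"

# Read spanning reference[4:25] with its second-to-last base deleted.
late_deletion = reference[4:23] + reference[24:25]
# Read spanning reference[4:25] with a base deleted in the middle.
mid_deletion = reference[4:14] + reference[15:25]

print(best_ungapped_hit(late_deletion, reference))
print(best_ungapped_hit(mid_deletion, reference))
```

In this toy case the read with the deletion near its end still fits at offset 4 with a single mismatch, so an ungapped aligner would report it as a near-perfect hit and the deletion would be hidden. The read with the deletion in the middle needs more than one mismatch at every offset.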

updated 24 months ago by Ram 43k • written 9.2 years ago by Devon Ryan 104k

0

Thank you for your answer.

I forgot to write it in the OP, but I planned on allowing zero or one mismatch while mapping with bowtie. In that case it should only map the reads that are perfectly (or almost perfectly) mappable somewhere on the genome and leave the rest unmapped for bowtie2 to find gapped alignments. That way I would only set aside the reads that have a perfect alignment somewhere on the genome, just in case bowtie2 doesn't find it.
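For anyone who wants to try it, the two-pass idea could be sketched roughly as below. This only builds the command lines; it assumes single-end FASTQ reads, and all file names, index names, and the helper function are placeholders. The flags used are bowtie's `-v` (mismatch limit), `-S` (SAM output), and `--un` (write unaligned reads to a file), and bowtie2's `-x`, `-U`, and `-S`.

```python
import shlex


def two_pass_commands(reads, bowtie1_index, bowtie2_index, prefix, mismatches=1):
    """Build the bowtie and bowtie2 command lines for the two-pass idea.

    All file names and index names are placeholders for illustration.
    """
    unmapped = f"{prefix}.bowtie1_unmapped.fq"
    first_pass = [
        "bowtie",
        "-v", str(mismatches),   # ungapped, at most this many mismatches
        "-S",                    # write SAM output
        "--un", unmapped,        # reads that fail to align are written here
        bowtie1_index,
        reads,
        f"{prefix}.bowtie1.sam",
    ]
    second_pass = [
        "bowtie2",
        "-x", bowtie2_index,
        "-U", unmapped,          # unpaired reads left over from the first pass
        "-S", f"{prefix}.bowtie2.sam",
    ]
    return first_pass, second_pass


first, second = two_pass_commands("reads.fq", "genome_bt1", "genome_bt2", "sample")
print(shlex.join(first))
print(shlex.join(second))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the indexes exist. Adding `--local` to the bowtie2 call would switch it to local alignment mode.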

updated 24 months ago by Ram 43k • written 9.2 years ago by markus.lippus ▴ 10

0

Ah, allowing a very low edit distance alleviates some of my concerns. BTW, here's a bwa mem-based pipeline that would presumably work and can perform realignment with soft-clipped alignments. I've never used it, but that might be a nicer alternative.

updated 24 months ago by Ram 43k • written 9.2 years ago by Devon Ryan 104k