Question: Using both bowtie and bowtie2 for finding indels
markus.lippus (Estonia) wrote 5.7 years ago:

Hi!

I'm new to both RNA-seq and bioinformatics itself.

In the project we're working on, we sequenced bacterial RNA from different samples, and one of the things we're trying to find in the data is deletions and insertions resulting from mistakes in ligation of the RNA.

I have chosen to use bowtie2 for alignment, as the argument about which aligner is better seems to be never-ending and also dependent on the situation. But I have seen people note that bowtie2 might prefer a gapped alignment over a non-gapped one in some situations.

We had a crazy idea: first use bowtie to map all reads, then pull out the reads that bowtie did not map (bowtie doesn't map reads with indels) and map those with bowtie2. It seems that this way I could worry less about getting false positives due to alignment.

 

Is this a bad idea? I haven't seen anyone propose it, so I suppose there's something wrong with it. 

Tags: bowtie, rna-seq, bowtie2, indel
Devon Ryan (Freiburg, Germany) wrote 5.7 years ago:

It sort of depends on how correct you want the results to be. If you're just doing this to get an idea about what regions to look at then this is probably OK. If you want results that will stand up in and of themselves to peer review (at least if I end up being your reviewer), then this method has issues.

If bowtie2 is preferring a gapped alignment over an ungapped one then that's likely the correct result. Bowtie1 will simply incorrectly map reads with indels toward one end or the other. Maybe it'll map a read to the right place with some mismatches (this is OK), but it might also just map elsewhere in the genome and place the mismatches elsewhere in the alignment. This will mess up your results. In an ideal world, you'd just use bowtie2 with local alignment and then use a local realigner to realign everything (including the soft-clipped regions) around indels. GATK is the obvious tool for this, though it unfortunately doesn't realign soft-clipped regions of alignments. Presumably there are other local realigners that will handle soft-clipped alignments in a better way (the one I wrote will, but it'll only work with BSseq alignments, though perhaps I should change that).
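
To see which reads an aligner reported as gapped or soft-clipped, one option is to inspect the CIGAR field of each SAM record, where `I` and `D` mark insertions and deletions and `S` marks soft-clipped bases. The following is only a minimal sketch of that idea; the function names `cigar_op_lengths` and `classify_alignment` are made up for illustration and are not part of bowtie2, GATK, or any realigner.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operation code from the SAM spec.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_op_lengths(cigar: str) -> Counter:
    """Sum the lengths of each CIGAR operation in a SAM CIGAR string."""
    totals = Counter()
    if cigar == "*":  # SAM uses "*" when no CIGAR is available (e.g. unmapped read)
        return totals
    pos = 0
    for match in _CIGAR_OP.finditer(cigar):
        if match.start() != pos:  # unexpected characters between elements
            raise ValueError(f"malformed CIGAR string: {cigar!r}")
        totals[match.group(2)] += int(match.group(1))
        pos = match.end()
    if pos != len(cigar):  # trailing characters that are not a valid element
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return totals


def classify_alignment(cigar: str) -> dict:
    """Flag whether an alignment contains an indel and/or soft-clipped bases."""
    ops = cigar_op_lengths(cigar)
    return {
        "has_indel": ops["I"] > 0 or ops["D"] > 0,
        "soft_clipped": ops["S"] > 0,
    }
```

Soft-clipped reads near a candidate indel are the ones a local realigner would ideally revisit, so counting them per region gives a rough idea of how much signal might be hidden in the clipped ends.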


Thank you for your answer.

I forgot to write it in the OP, but I planned on allowing zero or one mismatch while mapping with bowtie. In that case it should only map the reads that are perfectly mappable somewhere on the genome and leave the rest unmapped for bowtie2 to find gapped alignments. That way I would only remove the reads that have a perfect alignment somewhere on the genome, just in case bowtie2 doesn't find that alignment.
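
The two-pass idea described here could be sketched roughly as follows. This only builds the command lines and does not run anything. The helper names, file names, and index prefixes are placeholders chosen for illustration. The flags themselves are standard ones: bowtie's `-v` (maximum mismatches, 0 to 3), `-S` (SAM output) and `--un` (write unaligned reads to a file), and bowtie2's `-x`, `-U`, `-S` and `--local`. Note that bowtie and bowtie2 use different index formats, so each pass needs its own index.

```python
from typing import List


def bowtie1_strict_cmd(index: str, reads: str, sam_out: str,
                       unmapped_out: str, max_mismatches: int = 1) -> List[str]:
    """First pass: ungapped bowtie alignment with a small mismatch budget."""
    if not 0 <= max_mismatches <= 3:
        raise ValueError("bowtie -v accepts 0 to 3 mismatches")
    return [
        "bowtie",
        "-v", str(max_mismatches),  # end-to-end hits with at most this many mismatches
        "-S",                       # write alignments in SAM format
        "--un", unmapped_out,       # reads that fail to align go to this file
        index, reads, sam_out,
    ]


def bowtie2_gapped_cmd(index: str, reads: str, sam_out: str,
                       local: bool = False) -> List[str]:
    """Second pass: gapped bowtie2 alignment of the reads bowtie left unmapped."""
    cmd = ["bowtie2"]
    if local:
        cmd.append("--local")  # local alignment, ends may be soft-clipped
    cmd += ["-x", index, "-U", reads, "-S", sam_out]
    return cmd


def two_pass_commands(bt1_index: str, bt2_index: str,
                      reads: str, prefix: str) -> List[List[str]]:
    """Return both command lines; pass 2 consumes the unmapped reads of pass 1."""
    unmapped = f"{prefix}.unmapped.fq"
    return [
        bowtie1_strict_cmd(bt1_index, reads, f"{prefix}.pass1.sam", unmapped),
        bowtie2_gapped_cmd(bt2_index, unmapped, f"{prefix}.pass2.sam"),
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the two indexes exist.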

 

written 5.7 years ago by markus.lippus

Ah, allowing a very low edit distance alleviates some of my concerns. BTW, here's a bwa mem-based pipeline that would presumably work and can perform realignment with soft-clipped alignments. I've never used it, but that might be a nicer alternative.

written 5.7 years ago by Devon Ryan