Mapping reads to repetitive positions in reference within edit distance and with 1 indel
0
0
Entering edit mode
4.1 years ago
pawlowac ▴ 80

I have a list of subsequences between 30 - 40 nt that I need to map to a reference. I expect these reads to align to several positions in the reference. I need to find all exact alignments, any alignments within a given edit distance, and also any positions that are off by a single indel.

I can do all that is required with bowtie1 except the indel.

BBmap looks like it should work for this purpose, but I can't seem to find a single indel using the following;

bbmapskinner.sh  in=kmers.fasta out=result.sam ambiguous=all strictmaxindel=1

Note: I haven't tried to catch all alignments within a given edit distance yet, but the sequence is exactly as in the reference but with a single deletion or insertion.

My issue overall is that I have many hundreds of subsequences that I need to map to ~800k relatively homogeneous sequences (rRNA) and this takes a very long time normally with Bowtie and a second step using the regex module in python.

Two related questions; 1. What am I doing wrong with BBmap? 2. Is there a better and faster way to do this? Note, I'm looking at edit distances ~25% of the 30 - 40 mer subsequences. I haven't strongly considered BLASTn because it doesn't perform well with short sequences, but maybe that is the best course of action here?

Additional information: The second pass with regex is because I need to take positions of mismatches into account. Bowtie does a good job of that, but I understand that bowtie is not guaranteed to find all mapping positions and I need to know exactly which reference sequences produce a match.

bbmap mapping bowtie • 724 views
ADD COMMENT

Login before adding your answer.

Traffic: 2898 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6