bowtie2 '-a' does not return all alignments
0
0
Entering edit mode
3.8 years ago
shim ▴ 10

Hi I have an amplicons library with a few hundred targets that are similar, I expect no gaps, and for simplicity I expect no mismatches (so I penalize).

I use bowtie2 2.4.1 with the following command:

bowtie2 -a -D 100 -R 5 --local --no-unal --gbar 800 --ma 2 --mp 50,50 --np 50 --rdg 50,30 --rfg 50,50 --threads 1 -x ../library/index/libidx -U /scratch/file.fastq -S /scratch/out.sam

I look at alignments with NM:i:0, the problem is that '-a' doesn't return all possible alignments (I tried with and without '-D', '-R' options). When I look manually (simple search w/o MM) for some aligned sequence that it finds, I see there are many more alignments that it missed, sometimes there is only 1 alignment listed with higher score and many alignments with lower score. I know that bowtie2 wasn't designed for these situations in terms of performance, but so far it was fast enough for me.

Is it a well known problem? is there an undocumented limitation?

in the example SAM output below (single read, not all alignments, NM:i:0 alignments, no header, sequences+QS columns removed), all reads should be mapped to WT_Reference (only the WT was sequenced in this case), but there are many other alignments Variant_20, with CIGAR=388M197S, is the best alignment with NM:i:0, it is found 21 times (including WT_Reference) but is listed only once. The library size is 205 sequences (WT_Reference + Variant_1...Variant_204)

thanks, Shim

@PG ID:bowtie2  PN:bowtie2  VN:2.4.1    "CL:""/Tools/Bowtie/bowtie2-2.4.1-linux-x86_64/bowtie2-align-s --wrapper basic-0 -a -D 100 -R 5 --local --gbar 800 --ma 2 --mp 50,50 --np 50 --rdg 50,30 --rfg 50,50 --threads 1 -x ../library/index/libidx --passthrough -U /scratch/file.fastq"""                                                 
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_20: 602 255 388M197S    *   0   0   AS:i:776    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:388    YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_25: 602 255 161M424S    *   0   0   AS:i:322    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:161    YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_26: 602 255 160M425S    *   0   0   AS:i:320    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:160    YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_21: 602 255 137M448S    *   0   0   AS:i:274    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:137    YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_77: 602 255 79M506S *   0   0   AS:i:158    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:79 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_76: 602 255 79M506S *   0   0   AS:i:158    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:79 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_72: 602 255 79M506S *   0   0   AS:i:158    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:79 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_74: 602 255 79M506S *   0   0   AS:i:158    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:79 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_78: 602 255 79M506S *   0   0   AS:i:158    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:79 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_75: 602 255 79M506S *   0   0   AS:i:158    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:79 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_73: 602 255 79M506S *   0   0   AS:i:158    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:79 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_172:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_175:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_136:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_170:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_171:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_135:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_139:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_140:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_173:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_176:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_137:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_174:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_138:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_141:    602 255 68M517S *   0   0   AS:i:136    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:68 YT:Z:UU
M01015:88:000000000-J5K85:1:1101:12543:1681 256 Variant_185:    602 255 67M518S *   0   0   AS:i:134    XS:i:910    XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:67 YT:Z:UU
...
bowtie2 score alignment SAM • 1.2k views
ADD COMMENT
0
Entering edit mode

Short read aligners were never designed to be mathematically optimal or to produce all equally scoring alignments. Thus I would not recommend using that for this type of use cases.

ADD REPLY
0
Entering edit mode

Thank you Istvan, I'm aware short read aligners don't produce all possible alignments, but I do expect a high coverage of the best alignments (e.g. > 95%; and missing 20/21 of a particular score' alignments is a problem). Often there are "Effort options" (like bowtie2's '-D', '-R') that should produce more results probably at a significant performance cost, but, as I mentioned, that didn't help.

Do you know a tool that better suits my use case? i.e. mapping many fastq short reads to < 300 variants that are similar, it would be best if it output all the best-rank-alignments (similar to bowtie1's -a --best --strata).

PS my analysis is somewhat similar to microbiome analysis, but so far I found only mothur tool (Schloss, et al., 2009) which I'm not sure suits my pipeline..

best, shim

ADD REPLY
0
Entering edit mode

Try bbmap.sh as an alternate. It has a ambig=all option that should report all possible alignments for a read. There is a guide available. There are many options available on command line. You will need to spend some time experimenting.

ADD REPLY
0
Entering edit mode

Thank you very much genomax, I will try bbmap (I saw it in microbiome pipeline somewhere but never used it)

ADD REPLY

Login before adding your answer.

Traffic: 2115 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6