12.9 years ago
Travis
Hi,
I am trying to detect deletions approximately 66bp in length by mapping 90-base reads to the human genome with BWA (and using GATK). By fine-tuning the alignment parameters I have managed to increase my maximum detected indel length to around 30 bases, but so far I have been unable to improve on that. I have increased the maximum number of gap extensions to 200 and reduced the gap extension penalty to zero, so I expected the larger indels to be detectable. Can anyone offer any insight?
Thanks in advance.
You might consider artificially splitting your read into two fragments in various lengths and mapping them separately? Tophat uses that strategy for transcript splicing.
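A rough sketch of what that splitting could look like, in Python. The function name `split_read`, the minimum fragment length of 20 and the step of 5 are placeholders picked for illustration, not values from Tophat or BWA; each left/right pair would then be written out and mapped as two separate reads.

```python
def split_read(seq, min_len=20, step=5):
    """Yield (split_pos, left, right) for each candidate split point.

    min_len and step are arbitrary illustration values: both fragments
    are kept at least min_len bases long so they can still be mapped.
    """
    for pos in range(min_len, len(seq) - min_len + 1, step):
        yield pos, seq[:pos], seq[pos:]


if __name__ == "__main__":
    read = "ACGT" * 22 + "AC"  # toy 90-base read, not real data
    for pos, left, right in split_read(read):
        print(pos, len(left), len(right))
```

If the two fragments of a read map to the same strand with a gap of roughly 60-66bp between them on the reference, that read is a candidate for spanning the deletion.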
Hi @Travis, are you trying to find deletions in the reference that are insertions in your sequencing sample, or deletions in your sample that are insertions in the reference? And have you tried splitting your reads into halves?
At the minute I am trying to find two deletions in the 60-66bp range. The deletion is in the reads, not the reference.
I have detected deletions in the 45-64bp range using bwa/gatk (100bp PE Illumina), but haven't had them validated, so I don't want to say much more than that. I was under the impression from recent conference talks that identification of deletions over 30bp (one third of the read length) wasn't likely to happen with this particular toolchain.
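For anyone wanting to check what deletion lengths their alignments actually contain, here is a minimal Python sketch that reads the deletion length off a SAM CIGAR string. The helper names and the example CIGAR strings are made up for illustration; they are not output from bwa or from my data.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def longest_deletion(cigar):
    """Return the length of the longest D operation, or 0 if there is none."""
    return max((int(n) for n, op in CIGAR_OP.findall(cigar) if op == "D"),
               default=0)


def reference_span(cigar):
    """Return the number of reference bases the alignment covers."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


if __name__ == "__main__":
    # Hypothetical 90-base read carrying a 66bp deletion in its middle.
    cigar = "45M66D45M"
    print(longest_deletion(cigar), reference_span(cigar))
```

A 90-base read with a 66bp deletion has to cover 156 reference bases, so the gap is a large fraction of the whole alignment.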
Cheers Daniel. I am attempting to tweak another parameter or two so will see what happens on that front and update. I am determined to do it somehow!
Good advice, but I am specifically attempting to address the program without customization of anything other than existing program parameters.
You could also try other aligners such as blat, gsnap, and bfast.
Bfast will be my next port of call if I can't get the desired results from bwa. I spotted Nils commenting somewhere about the fact that his tests showed better indel detecting abilities than with bwa. The only drawback is speed.
Does anyone know what is limiting BWA's ability to find a 63bp deletion when it can find a 30bp deletion without a problem? Gap extension penalties are turned off.