Question: Calling large indels from NGS
4.9 years ago, Adrian Pelin (2.4k) wrote:


I have population sequencing data (125 bp SE) for an organism, and I have managed to find large indels spanning several kb with bbmap tools. See here how the data looks when the .sam file is visualized.

It looks like some might even be supported by 2 reads or more (same start same stop).

Is there any way to call these? What software would you recommend? I tried freebayes, but I can't get any calls for some reason.



I tried the following commands with samtools:
samtools mpileup -d 250 -m 1 -E --BCF --output-tags DP,DV,DP4,SP -f RefAnnotated_SPades_HXE-3.fasta -o mapped_AdaptTrim_trial1.bcf mapped_AdaptTrim.sort.bam
bcftools index mapped_AdaptTrim_trial1.bcf
bcftools call --skip-variants snps --multiallelic-caller --variants-only  -O v mapped_AdaptTrim_trial1.bcf -o mapped_AdaptTrim_trial1_vcf.vcf

However, the indels detected this way are short, 50-100 bp at most. The ones I am seeing in the image posted are a few kb in length.

samtools ngs large-indels • 2.4k views
modified 4.9 years ago by nchuang (210) • written 4.9 years ago by Adrian Pelin (2.4k)

Do those long thin lines represent deletions? If so, they mostly look like false mappings. Having just two supporting reads amongst those many false hits is too weak to be useful.

written 4.9 years ago by lh3 (32k)

Yes, they are deletions. In one case (a different sample), 14 reads support a single 11 kb deletion. I got freebayes to call long deletions, but I am running into a problem: if freebayes calls a deletion from position X to position Y, it will not call a smaller deletion located between X and Y, even if the smaller one has more reads supporting it. I just need something to call them; then I can try PCR.
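As a rough cross-check independent of any variant caller, reads can be grouped by the exact start and stop of the long deletion in their CIGAR string, so that nested deletions are each counted on their own. This is only a minimal sketch: the function name `count_long_deletions` and the `min_len` threshold are made up for illustration, and it assumes the aligner writes long gaps as `D` or `N` CIGAR operations in a single SAM record.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_long_deletions(sam_lines, min_len=1000):
    """Count reads per (chrom, start, end) deletion of at least min_len bp.

    Coordinates are 1-based and inclusive. Hypothetical helper, not part of
    samtools, freebayes, or bbmap.
    """
    support = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue
        chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
        if cigar == "*":  # unmapped or no alignment information
            continue
        ref = pos  # reference coordinate of the next aligned base
        for length, op in CIGAR_OP.findall(cigar):
            length = int(length)
            if op in "DN" and length >= min_len:
                support[(chrom, ref, ref + length - 1)] += 1
            if op in "MDN=X":  # operations that consume the reference
                ref += length
    return support
```

Each distinct (start, end) pair gets its own count, so a smaller deletion lying inside a larger one is reported separately. The counts are raw read support only; they say nothing about mapping quality or whether the alignment is a false mapping.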

modified 4.9 years ago • written 4.9 years ago by Adrian Pelin (2.4k)

You'd better try lumpy/delly first, using their preferred mappers. They look at split reads as well as read pairs. They are peer-reviewed and independently evaluated. I know too little about bbmap's split-read alignments to give any useful advice.

written 4.9 years ago by lh3 (32k)

Will do; however, lumpy at least takes bwa-mem alignments as input. I tried it, and under default parameters it only finds deletions of at most 20 bp in 125 bp reads. Is it safe to tweak them, or do lumpy/delly take unmapped reads into account as well?

written 4.9 years ago by Adrian Pelin (2.4k)

bwa-mem will produce two hits for split reads. At least lumpy uses this information. Giving two hits is more general. You can't put inversions or translocations in one sam line.
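To illustrate the "two hits" representation: in SAM output the extra hit of a split read is normally a supplementary record, and each record points at the other via the `SA:Z:` tag, whose entries have the form `rname,pos,strand,CIGAR,mapQ,NM;` according to the SAM optional-fields specification. The sketch below just parses that tag from one SAM line; the function name `other_hits` is hypothetical, and the example assumes the aligner actually emitted an SA tag.

```python
def other_hits(sam_line):
    """Return the other alignments of a split read, taken from its SA tag.

    Each hit is (rname, pos, strand, cigar, mapq, nm). Hypothetical helper
    for illustration; returns an empty list when no SA tag is present.
    """
    fields = sam_line.rstrip("\n").split("\t")
    hits = []
    for tag in fields[11:]:  # optional fields start at column 12
        if not tag.startswith("SA:Z:"):
            continue
        for entry in tag[5:].split(";"):
            if not entry:  # the tag value ends with a trailing ';'
                continue
            rname, pos, strand, cigar, mapq, nm = entry.split(",")
            hits.append((rname, int(pos), strand, cigar, int(mapq), int(nm)))
    return hits
```

Because every hit carries its own reference name and strand, the same layout can describe a deletion (same contig, same strand, a gap between hits), an inversion (strand change), or a translocation (different contig), which a single CIGAR string cannot express.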

written 4.9 years ago by lh3 (32k)

Ok fair point. I just successfully ran lumpy, and it identified 2 candidates over 10kb, one supported by 4 split reads, and one supported by 20 or so split reads and a few PE reads.

Is there any way to run it on SE reads only (so that it only looks for split reads)? My fragment length was ridiculously low, so I ended up merging most of my reads and treating them as SE. I suppose I could trick the software by creating fake PE reads (reverse complement my SE reads and call them R2), but I hate doing that.
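For reference, the "fake R2" trick mentioned above would amount to something like the sketch below. It is not a recommendation, and whether a structural variant caller behaves sensibly on such fully overlapping pairs is untested here. The names `reverse_complement` and `fake_mate` are made up for illustration, and a FASTQ record is assumed to be a plain (name, sequence, quality) tuple.

```python
# Complement table for the four bases plus N, in both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def fake_mate(record):
    """Build a fake R2 record from an SE FASTQ record.

    record is (name, sequence, quality). The sequence is reverse
    complemented and the quality string reversed so that each base
    keeps its original quality score.
    """
    name, seq, qual = record
    return (name, reverse_complement(seq), qual[::-1])
```

The fake mate carries no new information: every "pair" would have an insert size equal to the read length, so any paired-end signal derived from it is artificial.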

written 4.9 years ago by Adrian Pelin (2.4k)

Delly has supported SE reads for InDel detection since v0.7.1.

modified 9 months ago by RamRS (30k) • written 4.8 years ago by trausch (1.5k)
4.9 years ago, h.mon (31k) wrote:

Subread can detect indels up to 200bp, according to its user guide.

Maybe you will have better luck with CNV detection software? People in my lab use CNVnator and they seem to like it.

written 4.9 years ago by h.mon (31k)

Since the read support for these indels is so low, sometimes just a few reads out of 3000x genome coverage, I am afraid CNV-related software will not pick up such small differences in coverage. I am wondering whether I could use some sort of intron prediction meant for RNA data for this.
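To put a number on that concern, here is a back-of-the-envelope sketch. It combines the 14 supporting reads mentioned earlier in the thread (which came from a different sample) with the 3000x figure purely for illustration, and the function name `depth_drop_fraction` is made up.

```python
def depth_drop_fraction(supporting_reads, total_depth):
    """Fraction by which read depth falls inside a deletion.

    Assumes every read carrying the deletion contributes no coverage
    inside it, and all other reads cover the region normally.
    """
    if total_depth <= 0:
        raise ValueError("total_depth must be positive")
    return supporting_reads / total_depth


# Illustrative values only: 14 reads with the deletion at 3000x depth.
drop = depth_drop_fraction(14, 3000)
depth_inside = 3000 - 14
```

Under these assumptions the depth inside the deletion would fall from 3000 to 2986, a drop of under half a percent, which is the kind of difference a read-depth method would have to detect.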

written 4.9 years ago by Adrian Pelin (2.4k)
4.9 years ago, nchuang (210, United States) wrote:

MELT, which was featured in the recent Nature paper, should be released soon. Maybe it's out already?

written 4.9 years ago by nchuang (210)

Can't find it anywhere; it seems a lot of tools designed for evaluating DNA melting temperatures start with or contain "melt".

written 4.9 years ago by Adrian Pelin (2.4k)