Question: How common are Deletions and Indels?
Asked 3.9 years ago by beausoleilmo (McGill University):

I extracted some information from a VCF file to produce summary statistics. I have 96 individuals that were sequenced with a RADseq protocol. I ran RTG's vcfstats. Here is my output (rearranged from the original output so that more information is visible at once).

Each row is an individual, and the columns are, in order:

 [1] "Deletion Het/Hom ratio"        "Deletions"                     "Indel Het/Hom ratio"          
 [4] "Indel/SNP+MNP ratio"           "Indels"                        "Insertion Het/Hom ratio"      
 [7] "Insertion/Deletion ratio"      "Insertions"                    "Missing Genotype"             
[10] "MNP Het/Hom ratio"             "MNPs"                          "Same as reference"            
[13] "Sample Name"                   "SNP Het/Hom ratio"             "SNP Transitions/Transversions"
[16] "SNPs"                          "Total Het/Hom ratio"           "sp"

[Image: per-individual vcfstats output table, not reproduced here]

I was wondering whether it is normal or common to have 0 deletions and 0 indels. I find it odd to see only 0's. Am I the only one seeing this in their data?

Tags: vcftools, rtg, radseq
Answered 3.9 years ago by Brian Bushnell (Walnut Creek, USA):

This depends on both your sequencing platform and your processing methodology. Some aligners (Bowtie 1, for example) simply don't allow indels. Some platforms, like Complete Genomics and SOLiD, make indel calling very difficult and inaccurate. It's possible that something about your RADseq protocol precludes them (I'm not experienced with RADseq data). For exon capture with Illumina reads and an indel-capable pipeline, I'd expect to see a lot of indels. Not as many as SNPs; maybe 10% as many.

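One way to cross-check the zero counts independently of vcfstats is to tally variant types straight from the REF and ALT columns of the VCF. The following is a minimal sketch, not the method any of the tools above use: the function names `classify` and `count_variant_types` and the example records are made up for illustration, and it counts alleles per site rather than genotypes per sample, so the numbers will not match a per-individual vcfstats table exactly.

```python
from collections import Counter

def classify(ref, alt):
    """Classify one REF/ALT pair by comparing allele lengths."""
    # Missing, spanning-deletion, symbolic and breakend alleles are not sized here.
    if alt in (".", "*") or alt.startswith("<") or "[" in alt or "]" in alt:
        return "other"
    if len(ref) == len(alt):
        return "SNP" if len(ref) == 1 else "MNP"
    return "insertion" if len(alt) > len(ref) else "deletion"

def count_variant_types(vcf_lines):
    """Count variant types over an iterable of VCF text lines."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta lines, the column header, and blank lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]
        for alt in alts.split(","):  # multi-allelic sites list several ALTs
            counts[classify(ref, alt)] += 1
    return counts

# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tAT\tA\t50\tPASS\t.",
    "chr1\t300\t.\tC\tCGG\t50\tPASS\t.",
    "chr1\t400\t.\tAC\tGT\t50\tPASS\t.",
]
counts = count_variant_types(example)
print(dict(counts))
```

To run it on a real file, pass an open file handle instead of the list (for example `count_variant_types(open("calls.vcf"))`, where `calls.vcf` is a placeholder name). If this also reports no insertions or deletions, the indels were most likely never called, rather than being dropped by the statistics step.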

I used BWA for the alignment. I specified nothing in my pipeline related to deletion/indel removal. Maybe it's a default parameter of one of the functions I used in the pipeline...

Reply written 3.9 years ago by beausoleilmo

What was your variant caller, and what SAM/BAM version is the BWA output? I observed the same phenomenon with the BBMap aligner (which outputs SAM v1.4 by default) plus FreeBayes (which expected v1.3 input). I solved it by adding the 'sam=1.3' flag to the aligner.

Reply written 3.9 years ago by harold.smith.tarheel

I have BWA 0.7.13-r1126, and I'm not sure how to check the SAM/BAM version. If you mean the samtools version, it's 1.3.1 (using htslib 1.3.1); I also have VCFtools 0.1.15 and BCFtools 1.3.1 (using htslib 1.3.1). Could version 1.3.1 be a problem, or should I have 1.3?

Reply written 3.9 years ago by beausoleilmo
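On the version question: the samtools release number (1.3.1) is the software version and is separate from the SAM format version. In the SAM specification, the format version is recorded in the `VN:` tag of the `@HD` header line, and the header can be printed with `samtools view -H`. The sketch below shows how that tag could be read from header text. The function name `sam_format_version` and the example header lines are illustrative assumptions, and some aligners do not write an `@HD` line at all, in which case the function returns `None`.

```python
def sam_format_version(header_lines):
    """Return the VN value from the @HD header line, or None if absent."""
    for line in header_lines:
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs after the record type.
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("VN:"):
                    return field[3:]
    return None

# Hypothetical header text, as printed by `samtools view -H` on some BAM file.
header = [
    "@HD\tVN:1.4\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:248956422",
]
version = sam_format_version(header)
print(version)
```

A format version different from the one the variant caller expects would be consistent with the BBMap plus FreeBayes case described above, though whether that applies to a BWA-based pipeline is not established in this thread.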