Question: How common are Deletions and Indels?
23 months ago
McGill University
beausoleilmo210 wrote:

I extracted some information from a VCF file to output summary statistics. I have 96 individuals that were sequenced with a RADseq protocol. I ran RTG's vcfstats. Here is my output (rearranged from the original output to show more information at once).

Each line is an individual and the columns are in order:

 [1] "Deletion Het/Hom ratio"        "Deletions"                     "Indel Het/Hom ratio"          
 [4] "Indel/SNP+MNP ratio"           "Indels"                        "Insertion Het/Hom ratio"      
 [7] "Insertion/Deletion ratio"      "Insertions"                    "Missing Genotype"             
[10] "MNP Het/Hom ratio"             "MNPs"                          "Same as reference"            
[13] "Sample Name"                   "SNP Het/Hom ratio"             "SNP Transitions/Transversions"
[16] "SNPs"                          "Total Het/Hom ratio"           "sp"

[Image: vcfstats output table, one row per individual]

I was wondering whether it is normal or common to have 0 deletions and indels. I find it weird to see only 0's. Am I the only one seeing this in my data?

written 23 months ago by beausoleilmo210
23 months ago
Walnut Creek, USA
Brian Bushnell16k wrote:

This depends on both your sequencing platform and processing methodology. Some aligners (Bowtie 1, for example) simply don't allow indels. Some platforms, like Complete Genomics and SOLiD, make indel calling very difficult and inaccurate. It's possible that something about your RADseq protocol precludes them (I'm not experienced with RADseq data). For exon capture with Illumina reads and an indel-capable pipeline, I'd expect to see a lot of indels: not as many as SNPs, maybe 10% as many.

written 23 months ago by Brian Bushnell16k
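One way to double-check the zeros independently of vcfstats is to classify each record by the lengths of its REF and ALT alleles. The following is a minimal sketch, not RTG's actual classification rules: it counts alleles per site rather than per sample, and the function names and the example records are made up for illustration.

```python
from collections import Counter

def classify(ref, alt):
    """Classify one REF/ALT pair by allele length (simplified rules, an assumption)."""
    if alt in (".", "*") or alt.startswith("<") or "[" in alt or "]" in alt:
        return "other"  # missing, spanning-deletion, symbolic or breakend alleles
    if len(ref) == len(alt):
        return "SNP" if len(ref) == 1 else "MNP"
    return "insertion" if len(alt) > len(ref) else "deletion"

def count_variant_types(vcf_lines):
    """Count variant types over all ALT alleles in an iterable of VCF text lines."""
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]  # columns 4 and 5 are REF and ALT
        for alt in alts.split(","):
            counts[classify(ref, alt)] += 1
    return counts

# Made-up records, only to show the behaviour.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",
    "chr1\t200\t.\tAT\tA\t50\tPASS\t.",
    "chr1\t300\t.\tC\tCGG\t50\tPASS\t.",
    "chr1\t400\t.\tAC\tGT\t50\tPASS\t.",
]
counts = count_variant_types(example)
print(dict(counts))
```

For a real file, the same function can be fed an open file handle (or `gzip.open(path, "rt")` for a compressed VCF). If the insertion and deletion counts are also zero here, the indels are absent from the VCF itself and were not lost in the summary step.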

I used BWA for the alignment. I specified nothing in my pipeline related to deletion/indel removal. Maybe it's a default parameter of one of the functions I used in the pipeline...

written 23 months ago by beausoleilmo210
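To narrow down where the indels disappear, it can help to check whether the alignments contain any insertions or deletions at all. The sketch below counts reads whose CIGAR string has an I or D operation, reading SAM text lines (for example the output of `samtools view` on the BAM). The function name and the example reads are hypothetical.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def count_cigar_indels(sam_lines):
    """Return (reads_with_insertion, reads_with_deletion, aligned_reads)."""
    with_ins = with_del = aligned = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        cigar = line.split("\t")[5]  # column 6 is the CIGAR string
        if cigar == "*":
            continue  # no alignment information for this read
        aligned += 1
        ops = {op for _, op in CIGAR_OP.findall(cigar)}
        with_ins += "I" in ops
        with_del += "D" in ops
    return with_ins, with_del, aligned

# Made-up alignment lines, only to show the behaviour.
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t120\t60\t20M2D30M\t*\t0\t0\t*\t*",
    "r3\t0\tchr1\t140\t60\t25M3I22M\t*\t0\t0\t*\t*",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
ins_reads, del_reads, aligned_reads = count_cigar_indels(example_sam)
print(ins_reads, del_reads, aligned_reads)
```

If reads with I or D operations are present in the alignments but the VCF has no indels, the variant caller or a later filtering step is the more likely place to look.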

What was your variant caller, and what SAM/BAM version is the BWA output? I observed the same phenomenon with the BBMap aligner (which outputs SAM v1.4 by default) plus FreeBayes (which expected v1.3 input). I solved it by adding the 'sam=1.3' flag to the aligner.

written 23 months ago by harold.smith.tarheel4.2k

I have BWA 0.7.13-r1126 and I'm not sure how to check the SAM/BAM version. If it is the samtools version, it's 1.3.1 (using htslib 1.3.1); I also have VCFtools 0.1.15 and BCFtools 1.3.1 (using htslib 1.3.1). Could version 1.3.1 be a problem, or should I have 1.3?

written 23 months ago by beausoleilmo210
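The SAM format version is separate from the samtools software version. In the SAM specification it is recorded in the VN tag of the optional @HD header line, which can be viewed with `samtools view -H`. A minimal sketch for pulling it out of the header text follows. The function name and the example header are assumptions for illustration, and a file without an @HD line simply gives `None`.

```python
def sam_format_version(sam_lines):
    """Return the VN value of the @HD header line, or None if there is none."""
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "@HD":
            for tag in fields[1:]:
                if tag.startswith("VN:"):
                    return tag[3:]
    return None

# Made-up header, only to show the behaviour.
example_header = [
    "@HD\tVN:1.4\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
]
version = sam_format_version(example_header)
print(version)
```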
