Question: What are the properties used for filtering VCF files?
3.6 years ago by
stat.140510
uk
stat.140510 wrote:

I have a filtered VCF file. It was filtered on low quality (< 11), IndelGap, and SnpGap.

First, why was a threshold of 11 chosen?

Second, what do SnpGap and IndelGap mean?

Finally, is there any way, or any evidence, to tell whether the data should be filtered further or has already been filtered enough?

I am a statistician and need to understand these things. If there is a paper or book that covers VCF tools and the VCF format, it would be helpful.

 

Thanks. 

 

snp next-gen R vcf
3.6 years ago by
Philadelphia
Ashutosh Pandey11k wrote:

1) Indel gap, in the context of filtering, refers to the minimum distance between an indel and a SNP. Indels can cause mapping artifacts and may generate false-positive SNPs nearby, so SNPs that lie in the vicinity of an indel are filtered out. I use a 20 bp threshold, so my filtered VCF file contains no SNPs within 20 bp of an indel.

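2) SNP gap is a similar concept. If you see a cluster of SNPs within a short window, it is highly likely that all of them are false positives. For example, if a 20 bp region contains 3 or more SNPs, they are probably false positives. Be aware that the names are tool-specific: in `bcftools filter`, for instance, `--SnpGap` filters SNPs within a given distance of an indel and `--IndelGap` filters clusters of nearby indels, so check the FILTER descriptions in your VCF header to see what was actually applied.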

3) I don't know which threshold you are talking about. Is it the variant quality score or the minimum base quality score? Different variant callers, such as GATK and SAMtools, produce different ranges of variant quality scores, and people use different cutoffs for different tools. For example, GATK suggests using a variant quality score of 30 or more.
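To make the three filters above concrete, here is a minimal Python sketch, not the implementation of any particular tool. The `Variant` record, the `label_snps` function, and the filter labels `LowQual`, `NearIndel`, and `SnpCluster` are all hypothetical names chosen for illustration. It assumes the quality threshold refers to the Phred-scaled QUAL column, where QUAL = -10 log10 P(call is wrong), and it measures distance from the indel's start position only, ignoring indel length.

```python
from collections import defaultdict
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int      # 1-based position, as in the VCF POS column
    ref: str
    alt: str
    qual: float   # Phred-scaled QUAL column


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality to the probability that the call is wrong."""
    return 10 ** (-q / 10)


def is_indel(v: Variant) -> bool:
    # Simplification: any REF/ALT length difference counts as an indel.
    return len(v.ref) != len(v.alt)


def label_snps(variants, min_qual=30.0, indel_gap=20,
               cluster_size=3, cluster_window=20):
    """Return (snp, failed_filters) pairs; an empty list means the SNP passes."""
    snps = sorted((v for v in variants if not is_indel(v)),
                  key=lambda v: (v.chrom, v.pos))

    indel_pos = defaultdict(list)
    for v in variants:
        if is_indel(v):
            indel_pos[v.chrom].append(v.pos)

    snp_pos = defaultdict(list)
    for v in snps:
        snp_pos[v.chrom].append(v.pos)

    # Mark every run of `cluster_size` consecutive SNPs that fits inside
    # a window of `cluster_window` bp.
    clustered = set()
    for chrom, positions in snp_pos.items():
        for j in range(len(positions) - cluster_size + 1):
            k = j + cluster_size - 1
            if positions[k] - positions[j] < cluster_window:
                clustered.update((chrom, p) for p in positions[j:k + 1])

    labelled = []
    for v in snps:
        failed = []
        if v.qual < min_qual:
            failed.append("LowQual")
        if any(abs(p - v.pos) <= indel_gap for p in indel_pos[v.chrom]):
            failed.append("NearIndel")
        if (v.chrom, v.pos) in clustered:
            failed.append("SnpCluster")
        labelled.append((v, failed))
    return labelled


# Made-up example records, for illustration only.
example = [
    Variant("chr1", 100, "A", "G", 50.0),   # isolated, high quality
    Variant("chr1", 200, "AT", "A", 60.0),  # a deletion
    Variant("chr1", 210, "C", "T", 45.0),   # 10 bp from the deletion
    Variant("chr1", 500, "G", "A", 40.0),   # three SNPs within 13 bp
    Variant("chr1", 505, "T", "C", 40.0),
    Variant("chr1", 512, "A", "C", 40.0),
    Variant("chr1", 900, "C", "G", 8.0),    # low QUAL
]

for snp, failed in label_snps(example):
    print(snp.chrom, snp.pos, ";".join(failed) or "PASS")
```

With these made-up records, only the SNP at position 100 passes; the one at 210 fails for being near the indel, the three around 500 fail as a cluster, and the one at 900 fails on quality. Under the Phred assumption, `phred_to_error_prob(11)` is roughly 0.08 and `phred_to_error_prob(30)` is 0.001, which gives a statistical reading of what a cutoff of 11 versus 30 implies, although callers differ in how well calibrated their QUAL values are.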

 

 
