Standard approach for telling whether sites were removed during calling because of poor quality or because they are non-variant?
4.1 years ago
curious ▴ 750

I am generating some VCFs from WES. I ran the BAMs through the standard GATK workflow with no problem, but then realized that sites that are not represented in my final VCF could have been:

A. Filtered out because of low quality, or

B. Filtered out because they are non-variant

My original plan was to re-run GATK and preserve non-variant sites at each step. This was quite a battle, which I ultimately lost. Is there a standard approach for determining whether a site was removed during calling because every sample was homozygous reference or because of low quality?

I am struggling quite a bit with this. It seems like a standard thing someone would want to know, so I am worried I am missing something obvious.

gatk vcf • 579 views

variants that are not represented in my final VCF could be ... filtered out because they are non-variant

I'm sorry, what does that mean?


Oops, sorry: sites could be filtered out because they are non-variant, i.e. all samples are homozygous reference. If I understand correctly, these sites are typically removed while running GATK.


Yes, sites that are hom-ref for all samples are not included in the VCF file. They are included as blocks in gVCF files, though. You'll need to look at your pipeline for hard filters. GATK usually applies soft filters, which mark the FILTER column of each entry as PASS or <FILTER_NAME> to denote whether the variant passed a QC filter, unless a parameter was used that hard-filters (removes) variants.
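
To make the distinction above concrete, here is a minimal sketch of how one position could be looked up in VCF or gVCF text. It assumes a per-sample gVCF of the kind HaplotypeCaller writes in GVCF mode, where reference blocks have <NON_REF> as the only ALT allele and an END=<pos> key in INFO. The function name classify_site, the returned labels, and the example records are all made up for illustration and are not part of GATK.

```python
def classify_site(lines, chrom, pos):
    """Label one position using plain-text VCF/gVCF records (hypothetical helper)."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        f = line.rstrip("\n").split("\t")
        rec_chrom, start, ref, alt, flt, info = f[0], int(f[1]), f[3], f[4], f[6], f[7]
        # A record spans len(REF) bases unless INFO carries END= (reference block).
        end = start + len(ref) - 1
        for item in info.split(";"):
            if item.startswith("END="):
                end = int(item[4:])
        if rec_chrom != chrom or not (start <= pos <= end):
            continue
        if alt in (".", "<NON_REF>"):
            return "hom_ref_block"          # covered, but no variant called
        if flt in ("PASS", "."):
            return "variant_kept"           # passed, or no filter applied yet
        return "variant_soft_filtered"      # FILTER holds a filter name
    return "not_in_file"                    # no record covers this position


# Made-up records, for illustration only.
EXAMPLE_GVCF = [
    "##fileformat=VCFv4.2",
    "chr1\t100\t.\tA\t<NON_REF>\t.\t.\tEND=150\tGT:DP:GQ\t0/0:30:90",
    "chr1\t151\t.\tG\tT,<NON_REF>\t500\tPASS\t.\tGT\t0/1",
    "chr1\t160\t.\tC\tA,<NON_REF>\t20\tLowQual\t.\tGT\t0/1",
]

for p in (120, 151, 160, 200):
    print(p, classify_site(EXAMPLE_GVCF, "chr1", p))
```

A position that comes back as not_in_file was either never covered or was hard-filtered somewhere in the pipeline; the file alone cannot say which. For real data, a dedicated VCF parser and an indexed file would be a better choice than scanning text lines.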
