16 hours ago
Ramnaresh
I’m performing germline variant calling using GATK (gVCF-based workflow) and would like to confirm whether the order of my steps is correct.
What I have done so far:
CombineGVCFs ==> cohort.g.vcf.gz
GenotypeGVCFs ==> output.vcf.gz
VariantFiltration (ExcessHet) ==> cohort_excesshet.vcf.gz
bcftools norm -m-any --check-ref w -f "$reference" -Oz -o cohort.nm.vcf.gz cohort_excesshet.vcf.gz ==> cohort.nm.vcf.gz
MakeSitesOnlyVcf ==> cohort.sitesonly.vcf.gz
VQSR
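
For reference, here is a minimal sketch of the steps above as shell commands, in the order listed. It is a dry run by default and only prints the commands. Several details are assumptions rather than values from the actual run: the sample and resource file names, the single HapMap resource line, the annotation list, the ExcessHet threshold of 54.69 (the value used in GATK's documentation), the 99.7 sensitivity level, SNP mode only, `--check-ref w` meaning "warn on reference mismatch", and applying the recalibration back to the full normalized VCF.

```shell
#!/bin/sh
set -eu

# Hypothetical paths and settings; adjust to your own data.
reference="${reference:-GRCh38.fa}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = also execute them

run() {
  # Print the command line; execute it only when DRY_RUN=0.
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

pipeline() {
  # 1. Merge per-sample gVCFs (only two placeholder samples shown).
  run gatk CombineGVCFs -R "$reference" \
      -V sample1.g.vcf.gz -V sample2.g.vcf.gz -O cohort.g.vcf.gz
  # 2. Joint genotyping.
  run gatk GenotypeGVCFs -R "$reference" -V cohort.g.vcf.gz -O output.vcf.gz
  # 3. Flag sites with excess heterozygosity (threshold is an assumption).
  run gatk VariantFiltration -V output.vcf.gz \
      --filter-expression "ExcessHet > 54.69" --filter-name ExcessHet \
      -O cohort_excesshet.vcf.gz
  # 4. Split multiallelics, check REF alleles, write bgzipped output, index it.
  run bcftools norm -m-any --check-ref w -f "$reference" \
      -Oz -o cohort.nm.vcf.gz cohort_excesshet.vcf.gz
  run bcftools index -t cohort.nm.vcf.gz
  # 5. Drop genotype columns so the VQSR model is built on a small file.
  run gatk MakeSitesOnlyVcf -I cohort.nm.vcf.gz -O cohort.sitesonly.vcf.gz
  # 6. VQSR, SNP mode only; resource file and annotations are placeholders.
  run gatk VariantRecalibrator -R "$reference" -V cohort.sitesonly.vcf.gz \
      --resource:hapmap,known=false,training=true,truth=true,prior=15 hapmap.vcf.gz \
      -an QD -an FS -an MQ -an SOR -mode SNP \
      -O cohort.snps.recal --tranches-file cohort.snps.tranches
  # Assumption: the model is applied to the full VCF, not the sites-only one.
  run gatk ApplyVQSR -R "$reference" -V cohort.nm.vcf.gz \
      --recal-file cohort.snps.recal --tranches-file cohort.snps.tranches \
      --truth-sensitivity-filter-level 99.7 -mode SNP \
      -O cohort.snps.vqsr.vcf.gz
}

pipeline
```

Running it with `DRY_RUN=0` would execute the commands instead of only printing them. A real run would also need an INDEL-mode pass and the full set of training resources.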
I’m trying to generate a high-quality cohort VCF and will later analyze per-patient variants. Is this the correct order of steps? Should normalization (bcftools norm) or VariantFiltration be performed before or after VQSR?
System:
GATK 4.6.2.0
Reference: GRCh38
~140 samples (gVCFs)
16 GB RAM
Any suggestions or corrections are appreciated!
Thanks in advance.