Hi all,
I generated gVCF files with the intention of creating a multi-sample cohort VCF. However, some of these samples need to be analysed individually and therefore need to be converted to regular VCF format. gVCF files generated with the GATK pipeline carry <NON_REF> as a symbolic allele. The bcftools convert --gvcf2vcf command does not deal with this: <NON_REF> still appears in the converted VCF, which other tools in downstream analyses then fail to handle.
The GATK GenotypeGVCFs command can handle this conversion but is very slow. Is there an efficient way to generate a VCF file from a gVCF that contains the actual ALT allele instead of the <NON_REF> symbolic allele?
Thank you in advance!
Maira
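For anyone landing here with the same question, the transformation being asked for can be sketched in plain Python. This is only a rough illustration, not a replacement for GenotypeGVCFs, which re-genotypes the sample rather than editing records. The function name strip_non_ref and the INFO_PER_ALT set are made up for this example. The sketch assumes diploid calls, assumes <NON_REF> is always the last ALT allele, trims only the AD and PL FORMAT fields plus the MLEAC and MLEAF INFO fields, does not remap GT, and leaves header lines to the caller.

```python
NON_REF = "<NON_REF>"
# Assumption: the only per-ALT (Number=A) INFO keys present are these two.
INFO_PER_ALT = {"MLEAC", "MLEAF"}


def strip_non_ref(line):
    """Return a VCF data line without the <NON_REF> allele.

    Returns None for reference blocks, where <NON_REF> is the only ALT.
    """
    cols = line.rstrip("\n").split("\t")
    alts = cols[4].split(",")
    if NON_REF not in alts:
        return "\t".join(cols)
    if alts == [NON_REF]:
        return None  # no real variant at this site
    if alts[-1] != NON_REF:
        raise ValueError("expected <NON_REF> to be the last ALT allele")

    alts = alts[:-1]
    cols[4] = ",".join(alts)
    n_alleles = len(alts) + 1                       # REF plus remaining ALTs
    n_genotypes = n_alleles * (n_alleles + 1) // 2  # diploid genotype count

    # Drop the last value of each per-ALT INFO field.
    info = []
    for item in cols[7].split(";"):
        key, sep, value = item.partition("=")
        if sep and key in INFO_PER_ALT:
            value = ",".join(value.split(",")[:len(alts)])
        info.append(key + sep + value)
    cols[7] = ";".join(info)

    # Trim AD (one value per allele) and PL (one value per genotype).
    # With <NON_REF> last, its PL entries are the trailing ones.
    if len(cols) > 9:
        keys = cols[8].split(":")
        for i in range(9, len(cols)):
            values = cols[i].split(":")
            for j, key in enumerate(keys[:len(values)]):
                if values[j] == ".":
                    continue
                if key == "AD":
                    values[j] = ",".join(values[j].split(",")[:n_alleles])
                elif key == "PL":
                    values[j] = ",".join(values[j].split(",")[:n_genotypes])
            cols[i] = ":".join(values)
    return "\t".join(cols)
```

In use, header lines (those starting with #) would be written through unchanged, every other line passed to strip_non_ref, and None results skipped. Any other per-allele annotation in the file would need the same trimming, so the output should be checked with a VCF validator before it goes into other tools.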
I am new to GATK practices. Can you please elaborate?
https://gatk.broadinstitute.org/hc/en-us/articles/360035535932-Germline-short-variant-discovery-SNPs-Indels-
Thank you! I was wondering whether it is the norm to separate individual samples from the combined GenotypeGVCFs output after VQSR, or whether that somehow hurts the quality of the data?