3.2 years ago
haruki
Hi there,
I used GATK (v4.1.8.1) to create a VCF from 255 BAM files.
First, I created a gVCF from each BAM file using the following command:
gatk HaplotypeCaller -R reference.fa -I input.bam -O output.vcf.gz -ERC GVCF
Next, I ran the following commands to import the gVCFs into a GenomicsDB workspace and genotype them jointly:
gatk GenomicsDBImport --genomicsdb-workspace-path my_database \
--batch-size 100 \
-L Chr01 \
--sample-name-map sample.txt
gatk GenotypeGVCFs -R reference.fa -V gendb://my_database -O merged.vcf.gz
Both commands completed successfully and I obtained the merged VCF.
I then checked the VCF and found some unexpected genotypes (GT).
Here is one example, showing the sample columns for three individuals:
0/0:7,0:7:21:.:.:0,21,274:. .:0,0:.:.:0|1:4877_G_A:.:4877 0|1:3,2:5:75:0|1:4877_G_A:75,0,120:4877
The VCF FORMAT is GT:AD:DP:GQ:PGT:PID:PL:PS
My samples are from a diploid plant, but in this example the GT of the second individual is a single "." with no second allele, so it looks like a haploid genotype.
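To make the example easier to read, here is a minimal Python sketch that splits each sample column against the FORMAT keys and counts how many alleles the GT field contains. The function and variable names are my own and only for illustration; they are not part of GATK, and the sketch assumes the columns are whitespace-separated as pasted above.

```python
import re

# FORMAT string and the three sample columns copied from the example above.
FORMAT = "GT:AD:DP:GQ:PGT:PID:PL:PS"
SAMPLES = (
    "0/0:7,0:7:21:.:.:0,21,274:. "
    ".:0,0:.:.:0|1:4877_G_A:.:4877 "
    "0|1:3,2:5:75:0|1:4877_G_A:75,0,120:4877"
)


def parse_sample(format_string, sample_string):
    """Pair each FORMAT key with the matching value of one sample column."""
    keys = format_string.split(":")
    values = sample_string.split(":")
    return dict(zip(keys, values))


def gt_allele_count(gt):
    """Count allele slots in a GT string; '/' and '|' both separate alleles."""
    return len(re.split(r"[/|]", gt))


records = [parse_sample(FORMAT, column) for column in SAMPLES.split()]

for index, record in enumerate(records, start=1):
    print(
        f"sample {index}: GT={record['GT']} "
        f"alleles={gt_allele_count(record['GT'])} "
        f"AD={record['AD']} DP={record['DP']} PGT={record['PGT']}"
    )
```

Parsed this way, the second individual has GT ".", AD 0,0 and a missing DP, yet it still carries PGT, PID and PS values.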
Why does this happen?
Thank you!