Is it ok to replace missing WGS calls with reference notation "0/0"?
2.5 years ago

I called variants on 200 WGS samples; each got around 4 mil variants. However, most were unique, and only about 1 mil variants were shared between most individuals.

I suppose it is normal behaviour that GATK won't output info about homozygous-reference sites in the genome, right? So the missing spaces should be simply filled with reference notation "0/0". Or shouldn't they?

Is there a correct way to fix the low overlap? Otherwise it greatly complicates rare variant analysis.

2.5 years ago

So the missing spaces should be simply filled with reference notation "0/0".

No. If there are no reads (e.g. a homozygous deletion), or if the quality is just too low, then the call would be './.'.
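To see how many genotypes in a merged call set are actually missing rather than reference, one could tally the genotype classes per record. This is only a minimal sketch, assuming a tab-separated multi-sample VCF data line with a GT key in the FORMAT column; the function names `classify_gt` and `tally_genotypes` and the example line are made up for illustration:

```python
from collections import Counter

def classify_gt(gt: str) -> str:
    """Classify a GT string as 'missing', 'hom_ref', or 'non_ref'."""
    # Treat phased (|) and unphased (/) genotypes the same way.
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return "missing"      # no-call, e.g. './.'
    if all(a == "0" for a in alleles):
        return "hom_ref"      # e.g. '0/0'
    return "non_ref"          # carries at least one ALT allele

def tally_genotypes(vcf_line: str) -> Counter:
    """Count genotype classes across all samples in one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    gt_index = fields[8].split(":").index("GT")  # FORMAT is column 9
    counts = Counter()
    for sample in fields[9:]:
        counts[classify_gt(sample.split(":")[gt_index])] += 1
    return counts

# Hypothetical record with three samples: het, no-call, hom-ref.
line = "chr1\t1000\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:30\t./.:0\t0/0:25"
print(tally_genotypes(line))
```

Keeping 'missing' as its own class is the point: a './.' says nothing about the genotype, so counting it as '0/0' would bias allele frequencies downwards.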

each got around 4 mil variants

There should be no "each" in GATK. Those samples should be called in GVCF mode and then combined with GenomicsDBImport + GenotypeGVCFs.
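A rough sketch of that workflow, written as small Python helpers that only build the command lines (nothing is executed). The file names, the interval `chr20`, and the workspace name `cohort_db` are placeholders, and the helper names are made up for illustration; check the flags against the GATK4 version you have installed:

```python
import shlex

def haplotypecaller_gvcf_cmd(reference, bam, out_gvcf):
    """Per-sample calling in GVCF mode (-ERC GVCF)."""
    return ["gatk", "HaplotypeCaller",
            "-R", reference, "-I", bam, "-O", out_gvcf,
            "-ERC", "GVCF"]

def genomicsdbimport_cmd(gvcfs, workspace, interval):
    """Consolidate all per-sample GVCFs into one GenomicsDB workspace."""
    cmd = ["gatk", "GenomicsDBImport",
           "--genomicsdb-workspace-path", workspace,
           "-L", interval]
    for gvcf in gvcfs:
        cmd += ["-V", gvcf]
    return cmd

def genotypegvcfs_cmd(reference, workspace, out_vcf):
    """Joint-genotype every sample in the workspace together."""
    return ["gatk", "GenotypeGVCFs",
            "-R", reference,
            "-V", f"gendb://{workspace}",
            "-O", out_vcf]

# Placeholder inputs for two samples.
samples = ["s1", "s2"]
gvcfs = [f"{s}.g.vcf.gz" for s in samples]
commands = [haplotypecaller_gvcf_cmd("ref.fasta", f"{s}.bam", g)
            for s, g in zip(samples, gvcfs)]
commands.append(genomicsdbimport_cmd(gvcfs, "cohort_db", "chr20"))
commands.append(genotypegvcfs_cmd("ref.fasta", "cohort_db", "cohort.vcf.gz"))

for cmd in commands:
    print(shlex.join(cmd))
```

Because all samples are genotyped jointly in the last step, every sample gets a genotype (or an explicit './.') at every site that is variant in anyone, which removes the need to guess at the gaps.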

  1. Would GATK output 0/0 if the reads contained only the reference allele? Or is that the expected behaviour only in GVCF mode?

  2. If I called them as regular VCFs, is there a way to fix it and recall in GVCF mode, apart from redoing the whole thing?

  1. No, not if the quality/depth is just too low.
  2. No.