Question: Why are some variants not called in some populations?
17 months ago by
star270 wrote:

I have three VCF files from three populations, and I merged all their SNPs together. Some variants have a genotype in some populations but not in others, and the missing ones show up as 'NA'. What does 'NA' mean here? Does it mean that no variant was called at that position in the samples with 'NA', or can I treat 'NA' as a reference genotype?

ID          CHROM  chromStart  chromEnd  REF  alleles  pop1  pop2  pop3
rs10084237  chr2   76517559    76517560  T    C,T,     NA    CC    NA
rs10084293  chr2   70917811    70917812  C    C,T,     CT    TT    TT
rs10084353  chr2   61020552    61020553  A    A,G,     AG    NA    GG

How did you merge the variants? The exact method you used is what determines what these NAs mean.

written 17 months ago by _r_am
17 months ago by
bari.ballew250 wrote:

To expand on what RamRS said: if you merged VCFs rather than gVCFs, note that a VCF typically only reports a genomic position when that individual carries a variant there, so you are susceptible to a missing-data problem. When a variant is reported in A.vcf but not in B.vcf, the merged file will record the genotype as missing ("./.") for sample B. That missing call is ambiguous: was there insufficient coverage to make a call, or was there plenty of coverage and simply no variant reads? The merged VCF alone cannot tell you which. If you're looking exclusively at very rare variants, then assuming a homozygous reference genotype for missing calls is sometimes appropriate, but it depends on the downstream analysis.
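
As a rough illustration of the distinction above, here is a minimal Python sketch that converts a missing call to homozygous reference only when there is enough read depth at that position to support it. Everything in it is an assumption made for the example: the function name `resolve_missing`, the sample names, the depth values, and the threshold of 10 reads are hypothetical, and the per-sample depth would have to come from somewhere outside the merged VCF (a gVCF or the BAM files, for instance).

```python
from typing import Dict, Optional

MISSING = "./."   # missing genotype, as written in a merged VCF
HOM_REF = "0/0"   # homozygous reference genotype


def resolve_missing(genotype: str, depth: Optional[int], min_depth: int = 10) -> str:
    """Replace a missing call with hom-ref only if coverage supports it.

    min_depth=10 is an arbitrary, illustrative threshold, not a recommendation.
    """
    if genotype != MISSING:
        return genotype          # a real call: leave it untouched
    if depth is not None and depth >= min_depth:
        return HOM_REF           # well covered, no variant reported
    return MISSING               # low or unknown coverage: stay missing


# Hypothetical merged genotypes for one variant, plus per-sample read depth
# obtained separately (the merged VCF does not carry it for missing calls).
merged: Dict[str, str] = {"sampleA": "0/1", "sampleB": "./.", "sampleC": "./."}
depths: Dict[str, int] = {"sampleA": 35, "sampleB": 28, "sampleC": 2}

resolved = {s: resolve_missing(gt, depths.get(s)) for s, gt in merged.items()}
print(resolved)
# {'sampleA': '0/1', 'sampleB': '0/0', 'sampleC': './.'}
```

If you decide the blanket assumption is acceptable for your analysis, `bcftools merge` has a `--missing-to-ref` option that sets missing genotypes to 0/0 at merge time, but it does so without checking coverage.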
