Question: What does 'NA' mean for some populations after merging VCF files?
14 months ago by
star wrote:

I have three VCF files from three populations, and I merged all their SNPs together. Some variants have genotypes in every population, but others are missing in one or more populations and show up as 'NA'. What does 'NA' mean here? Does it mean there was no variant call at that position in the samples with 'NA', or can I treat the 'NA' entries as reference genotypes?

ID          CHROM  chromStart  chromEnd  REF  alleles  pop1  pop2  pop3
rs10084237  chr2   76517559    76517560  T    C,T,     NA    CC    NA
rs10084293  chr2   70917811    70917812  C    C,T,     CT    TT    TT
rs10084353  chr2   61020552    61020553  A    A,G,     AG    NA    GG

How did you merge the variants? The exact method you used is what determines what these NAs mean.

written 14 months ago by RamRS
14 months ago by
bari.ballew wrote:

To expand on what RamRS said: if you merged VCFs rather than gVCFs, keep in mind that a VCF only reports a genomic position when at least one sample in that file carries a variant there, so the merged data are prone to a missing-data problem. When a variant is reported in A.vcf but not in B.vcf, the merged file records the genotype for sample B as missing ("./."). That missing call is ambiguous: was there insufficient coverage to make a call, or was there plenty of coverage and simply no variant reads? If you're looking exclusively at very rare variants, assuming a homozygous reference genotype for missing calls is sometimes appropriate, but it depends on the downstream analysis.
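
One way to make that distinction explicit is to fill in a missing call as homozygous reference only when you have independent evidence that the site was well covered in that population. The snippet below is a minimal sketch of that rule, not a definitive method. It assumes you can look up a per-sample read depth at each site (for example from a gVCF or the BAM files); the depth values, the `MIN_DEPTH` threshold, and the `resolve_missing` helper are all hypothetical and only there for illustration. It also assumes single-base SNPs, so the reference genotype is just the REF allele written twice.

```python
MIN_DEPTH = 10  # assumed coverage threshold (hypothetical); tune for your data


def resolve_missing(genotype, ref, depth, min_depth=MIN_DEPTH):
    """Return a genotype string for one sample at one SNP.

    genotype: the merged call, e.g. "CT", or "NA" when the site was absent
              from that sample's VCF.
    ref:      single-base reference allele, e.g. "T".
    depth:    read depth at the site for that sample, or None if unknown.
    """
    if genotype != "NA":
        return genotype      # a real call: keep it unchanged
    if depth is not None and depth >= min_depth:
        return ref + ref     # well covered, no variant reported: assume hom-ref
    return "NA"              # low or unknown coverage: leave as missing


# Two sites from the question. The depths are made-up illustrative values,
# not taken from the original data.
rows = [
    ("rs10084237", "T", {"pop1": ("NA", 35), "pop2": ("CC", 28), "pop3": ("NA", 2)}),
    ("rs10084353", "A", {"pop1": ("AG", 40), "pop2": ("NA", None), "pop3": ("GG", 31)}),
]

resolved = {
    snp_id: {pop: resolve_missing(gt, ref, dp) for pop, (gt, dp) in calls.items()}
    for snp_id, ref, calls in rows
}

for snp_id, calls in resolved.items():
    print(snp_id, calls)
```

With these invented depths, pop1 at rs10084237 becomes TT because the site looks well covered, while pop3 at the same site and pop2 at rs10084353 stay NA because coverage is low or unknown. If you have no coverage information at all, the blanket alternative is to treat every missing genotype as reference at merge time (bcftools merge has a --missing-to-ref option for this), which is only defensible under the rare-variant assumption described above.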
