Question: Need help interpreting the genotype fields in VCF-formatted data
Asked 7.0 years ago by CrazyB (United States):

I need some help understanding VCF files (yes, I've read the documentation from the 1000 Genomes Project and have a "basic" understanding of the format).

In the genotype result, for example, I have the following SNP identified.

chr1    860461    G    A    98    PASS   GT:CQ:DP     1/1:98:4    ./.:98:5    ./.:98:5    0/0:.:.

The genotypes for the 4 individuals are

AA, ?, ?, GG
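
For readers following along, here is a minimal Python sketch of how the per-sample fields in the record above map to those genotypes. It assumes the columns are exactly as written in this post (CHROM, POS, REF, ALT, QUAL, FILTER, FORMAT, then one column per sample); a complete VCF record would also carry ID and INFO columns. The function name decode_genotypes is made up for illustration.

```python
# Sketch only: decode the genotype columns of the record quoted in the post.
# Column layout is an assumption based on how the line is written here.

RECORD = "chr1 860461 G A 98 PASS GT:CQ:DP 1/1:98:4 ./.:98:5 ./.:98:5 0/0:.:."


def decode_genotypes(line):
    """Return one dict per sample with the decoded call and the raw fields."""
    fields = line.split()
    ref, alt = fields[2], fields[3]
    alleles = [ref] + alt.split(",")  # index 0 = REF, 1 and up = ALT alleles
    keys = fields[6].split(":")       # FORMAT column, e.g. GT:CQ:DP
    samples = []
    for raw in fields[7:]:
        values = dict(zip(keys, raw.split(":")))
        gt = values["GT"].replace("|", "/").split("/")
        if "." in gt:
            call = "?"                # "./." means no genotype was called
        else:
            call = "".join(alleles[int(i)] for i in gt)
        samples.append({"call": call, **values})
    return samples


for number, sample in enumerate(decode_genotypes(RECORD), start=1):
    print(number, sample["call"], "DP=" + sample["DP"])
```

The GT indices refer to alleles (0 is the reference base, 1 is the first alternate), which is why 1/1 reads as AA and 0/0 as GG, while a "." in any field means the value is missing.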

My questions are

  1. Why, with a read depth (DP) of 4, is the genotype of individual #1 "readable" and called AA, whereas for individual #2 the read depth is 5 but the genotype cannot be called and is reported as ./.?
  2. Why is individual #4 given a predicted GG genotype when nothing is readable (both the quality and depth fields are missing)?
  3. How did this SNP end up being given a "PASS" by the filter? To me, all 4 individuals have poor read coverage at this position.

Any help? Great many thanks

vcf exome-sequencing • 1.8k views

Which variant caller produced this VCF? It might be helpful to view the output from bam-readcount at this position for all four of your BAM files, to see exactly which reads support which bases and what the quality of those bases is.

Comment written 7.0 years ago by Malachi Griffith

Thanks a lot for the feedback. I will follow your lead and ask my co-worker for the info. I have to apologize, though, for not being familiar with the jargon. This VCF came out of a medical center's genomics core facility, and I believe they used the "standard" GATK from the Broad for the sequencing analysis (and made the calls). Is this what you were asking?

Reply written 7.0 years ago by CrazyB
Answer written 7.0 years ago by Gabriel R. (Danmarks Tekniske Universitet):

Not sure, but here goes:

  1. My guess is that the base quality is probably pretty bad for individual #2. Check the BAM files using samtools mpileup.
  2. GATK has a prior on seeing the reference. Sometimes it does not produce certain fields for homozygous-reference sites, depending on the version and the # of bulls sacrificed prior to running GATK.
  3. Ask the GATK developers. It's a terrible "software" and genotyper.
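
To make the base-quality check in point 1 concrete, here is a rough Python sketch that reads one line of single-sample samtools mpileup output and counts the bases that clear a quality cutoff. The pileup line is made up for illustration and is not taken from the poster's data; the cutoff of 20 and the function names are likewise assumptions. Qualities are assumed to be Phred+33 encoded.

```python
import re

# Hypothetical pileup line (tab-separated): chrom, pos, ref, depth, bases, quals.
# Real input could come from something like
#   samtools mpileup -f ref.fa -r chr1:860461-860461 sample.bam
# where ref.fa and sample.bam are placeholder file names.
EXAMPLE = "chr1\t860461\tG\t5\t.,A^]a$+1Ta\t##I5#"


def strip_indels(bases):
    """Remove indel descriptions such as +2AG or -1T from the base string."""
    out, i = [], 0
    while i < len(bases):
        if bases[i] in "+-":
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            i = j + int(bases[i + 1:j])  # skip the digits and the indel bases
        else:
            out.append(bases[i])
            i += 1
    return "".join(out)


def parse_pileup_line(line, offset=33):
    """Return (base, phred_quality) pairs for one pileup line."""
    chrom, pos, ref, depth, bases, quals = line.rstrip("\n").split("\t")[:6]
    bases = re.sub(r"\^.", "", bases)  # read-start marker plus mapping quality
    bases = bases.replace("$", "")     # read-end marker
    bases = strip_indels(bases)
    calls = []
    for symbol, qual in zip(bases, quals):
        base = ref.upper() if symbol in ".," else symbol.upper()
        calls.append((base, ord(qual) - offset))
    return calls


def count_confident(calls, min_quality=20):
    """Count bases whose quality is at least min_quality (arbitrary cutoff)."""
    counts = {}
    for base, quality in calls:
        if quality >= min_quality:
            counts[base] = counts.get(base, 0) + 1
    return counts


calls = parse_pileup_line(EXAMPLE)
print(calls)
print(count_confident(calls))
```

In this invented example the depth is 5, but only two bases have a quality of 20 or more, which is the kind of situation where a caller might decline to report a genotype despite a nonzero DP.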