Question: 4/6 genotype in VCF
14 months ago, ajuiwl wrote:

Hi all, I have been doing variant calling on human cancer exomes, and some variants in the VCF file have genotypes like 4/6 or 2/3. Is there any explanation for this? I expected to see only 0/1 or 1/1. Thank you very much.

genotype variant calling vcf • 361 views

Genotypes work like this:

  • 0: same as reference
  • 1: first alternate allele
  • ...
  • n: n-th alternate allele

Where n has to be < (strictly lower than) the number of letters that compose your base space. In case of DNA, you have four (A,C,T,G), so you can expect at most 4-1=3 alternative alleles.

My suspicion is that you're calling variants in a very homebrew way and something is messing up your genotyping. If not: can you post your variant calling pipeline?

written 14 months ago by Macspider

Alternative alleles can contain multiple letters, so the assertion that n must be less than the size of the "base space" (e.g. ACGT) is false. You can see things like

alternative_alleles=TC,TCC,TCCC

I.e. each alternative allele can have multiple letters, and a site can list more than three of them.
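A minimal Python sketch of that point, assuming the ALT column is available as a plain string. The helper name `parse_alt` and the six-allele site are hypothetical and only there for illustration; they are not taken from the poster's data.

```python
def parse_alt(alt_field):
    """Split a VCF ALT column into its list of alternate alleles."""
    # "." is the VCF convention for "no alternate allele".
    return [] if alt_field == "." else alt_field.split(",")


# The example from this comment: three multi-letter alternate alleles.
alts = parse_alt("TC,TCC,TCCC")

# Hypothetical site with six alternate alleles (e.g. a repeat region).
many_alts = parse_alt("TC,TCC,TCCC,TCCCC,TCCCCC,TCCCCCC")

# Valid GT indices are 0 (REF) plus 1..len(ALT), whatever the alphabet size.
valid_indices = list(range(len(many_alts) + 1))

print(alts)
print(valid_indices)
```

With six alternate alleles listed, indices up to 6 are valid, so a genotype such as 4/6 is well formed.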

written 14 months ago by cmdcolin

That is true, if you allow multiple nucleotide polymorphisms. My bad, I was imprecise :)

written 14 months ago by Macspider

Thank you for the answers. What I do not understand is how calling in a diploid genome can yield variants with more than 3 alternative alleles (reference, copy 1 and copy 2) if only 1 sample is being used. Is that possible?

written 14 months ago by ajuiwl

@cmdcolin literally answered this question of yours.

written 14 months ago by Macspider
14 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

> is there any explanation for this fact

yes, read the VCF spec: https://samtools.github.io/hts-specs/VCFv4.2.pdf

> ALT - alternate base(s): Comma separated list of alternate non-reference alleles. The...
>
> (...)
>
> GT : genotype, encoded as allele values separated by either of / or | . The allele values are 0 for the reference allele (what is in the REF field), 1 for the first allele listed in ALT, 2 for the second allele list in ALT and so on.
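The GT rule quoted above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a replacement for a real VCF parser: `decode_gt` is a hypothetical helper, and the REF/ALT values in the usage lines are made up.

```python
import re


def decode_gt(ref, alt_field, gt):
    """Translate GT indices into allele strings using REF and ALT."""
    # Index 0 is REF; indices 1..n follow the order of the ALT column.
    alts = [] if alt_field == "." else alt_field.split(",")
    alleles = [ref] + alts
    decoded = []
    # Alleles in GT are separated by "/" (unphased) or "|" (phased).
    for idx in re.split(r"[/|]", gt):
        # "." marks a missing allele call.
        decoded.append(None if idx == "." else alleles[int(idx)])
    return decoded


# Hypothetical site with six ALT alleles: 4/6 picks the 4th and 6th of them.
call = decode_gt("A", "AT,ATT,ATTT,ATTTT,ATTTTT,ATTTTTT", "4/6")
print(call)

# A simple heterozygous call for comparison.
het = decode_gt("G", "A,T", "0|1")
print(het)
```

Note that the decoded call still contains only two alleles, as expected for a diploid sample; the numbers are positions in the ALT list, not a count of alleles carried.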
