Question: Understanding Imputed Genotypes
7.0 years ago by
United States
Sheila320 wrote:

I have a data set of imputed genotypes and I noticed that the values are not simple 0s, 1s and 2s. Instead they are values like 1.998, 1.865, 1.997, 2.000.

A couple of questions:

A) Could someone please explain why these genotypes are decimal values and not whole numbers?

B) And what does it mean if the genotype for SNP1 for Patient A is 1.999 and the genotype for SNP1 for Patient B is 1.865?

imputation genetics • 6.1k views
modified 6.2 years ago by Kantale120 • written 7.0 years ago by Sheila320

It is difficult to answer your question without more details. To start with: which software did you use? IMPUTE, MACH, Beagle? There are many such tools. A few example lines of input and output would also help.

written 7.0 years ago by Fabio Marroni2.5k

Hi Sheila. I agree with Fabio and fridhackery that more context would help in answering your question. Since you are relatively new to the forum, I suggest you read How to ask Good Questions on Technical and Scientific Forums.

written 7.0 years ago by Eric Normandeau10k
6.2 years ago by
Groningen, Netherlands
Kantale120 wrote:


The numbers you have look like dosages. A dosage is the expected number of copies of one allele, computed as a simple linear combination of the posterior genotype probabilities that imputation usually produces.

Suppose you have a SNP with alleles A and B, and your genotype probabilities are:

A/A: 0.1
A/B: 0.4
B/B: 0.5

(They should all sum to 1.0)

Then the dosage for this SNP (counting copies of B) is: 0*P(A/A) + 1*P(A/B) + 2*P(B/B) = 0*0.1 + 1*0.4 + 2*0.5 = 1.4

So the maximum dosage you can get is 2.0 (that is, when the genotype probabilities are 0 for A/A and A/B, and 1.0 for B/B).
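
The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `dosage` and the tolerance `tol` are made up for this example, and it assumes the dosage counts copies of allele B. Check your imputation software's documentation for which allele it actually counts.

```python
def dosage(p_aa, p_ab, p_bb, tol=0.01):
    """Expected number of B alleles, given posterior genotype probabilities.

    `tol` is an assumed tolerance for rounding in the input probabilities;
    it is not taken from any particular imputation tool.
    """
    total = p_aa + p_ab + p_bb
    if abs(total - 1.0) > tol:
        raise ValueError(f"probabilities sum to {total}, expected 1.0")
    # Weight each genotype by the number of B alleles it carries.
    return 0 * p_aa + 1 * p_ab + 2 * p_bb


# The example from this answer: P(A/A)=0.1, P(A/B)=0.4, P(B/B)=0.5
print(round(dosage(0.1, 0.4, 0.5), 3))  # 1.4
```

With probabilities of 0, 0 and 1.0 the same function returns 2.0, the maximum mentioned above.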

modified 6 months ago by RamRS27k • written 6.2 years ago by Kantale120
7.0 years ago by
fridhackery140 wrote:

Imputation of SNPs is a statistical guess at the likely genotype at a given locus, based on the other information about the haplotype. Because of the genetic distance between flanking markers with a known state, there is some probability of zero, one, two or more recombinations in the interval, resulting in a parental or recombinant haplotype. You should add more details to your post about how these numbers were generated (software, etc.) for a better answer, but I think the basic answer is this:

Both Patient A and Patient B most likely have a genotype of 2 at the locus for your SNP1. However, the data suggest that Patient A is more likely to have a 2 there than Patient B.
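
To put rough numbers on "more likely", here is a minimal sketch that assumes the values are dosages on a 0-to-2 scale, i.e. P(genotype 1) + 2*P(genotype 2), which the question does not confirm. A dosage alone does not pin down the individual genotype probabilities, but it does bound them: P(genotype 2) must lie between dosage - 1 and dosage / 2. The function name `bb_probability_bounds` is hypothetical.

```python
def bb_probability_bounds(dosage):
    """Range of P(genotype 2) consistent with a dosage on the 0-to-2 scale."""
    if not 0.0 <= dosage <= 2.0:
        raise ValueError("dosage must lie between 0 and 2")
    # Smallest value: P(genotype 0) = 0 when dosage >= 1, otherwise 0.
    lower = max(0.0, dosage - 1.0)
    # Largest value: reached when P(genotype 1) = 0.
    upper = dosage / 2.0
    return lower, upper


for patient, value in [("Patient A", 1.999), ("Patient B", 1.865)]:
    low, high = bb_probability_bounds(value)
    print(f"{patient}: P(genotype 2) between {low:.4f} and {high:.4f}")
```

Under that assumption, Patient A has a probability of at least about 0.999 of carrying genotype 2, while for Patient B it is at least about 0.865.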

written 7.0 years ago by fridhackery140