Question

Understanding Imputed Genotypes

4

10.8 years ago

Sheila ▴ 420

I have a data set of imputed genotypes and I noticed that the values are not simple 0s, 1s and 2s. Instead they are values like 1.998, 1.865, 1.997, 2.000.

A couple of questions:

A) Could someone please explain why these genotypes are decimal values and not whole numbers?

B) And what does it mean if the genotype for SNP1 for Patient A is 1.999 and the genotype for SNP1 for Patient B is 1.865?

imputation genetics • 12k views

updated 10.0 years ago by Kantale ▴ 140 • written 10.8 years ago by Sheila ▴ 420

1

It is difficult to answer your question without more details. To start: which software did you use? IMPUTE, MACH, Beagle? There are many such tools. Some example lines of input and output would also help.

10.8 years ago by Fabio Marroni ★ 3.0k

0

Hi Sheila. I agree with Fabio and fridhackery that more context would help in answering your question. Since you are relatively new to the forum, I suggest you read "How to ask Good Questions on Technical and Scientific Forums".

10.8 years ago by Eric Normandeau 11k

Ram · Answer 1 · 2014-04-19

Hi,

It seems that the numbers you have are dosages. A dosage is a simple linear transformation of the posterior genotype probabilities, which usually come from imputation.

Assume that you have a SNP with alleles A and B, and that your genotype probabilities are:

A/A: 0.1
A/B: 0.4
B/B: 0.5

(They should all sum to 1.0)

Then the dosage for this SNP, that is, the expected number of copies of the B allele, is: 0*P(A/A) + 1*P(A/B) + 2*P(B/B) = 0*0.1 + 1*0.4 + 2*0.5 = 1.4

So the maximum dosage you can get is 2.0 (that is, if the genotype probabilities are 0 for A/A and A/B, and 1.0 for B/B).
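As a rough illustration, the dosage calculation can be sketched in a few lines of Python. The function name `dosage` and the tolerance argument are my own choices for this example, not part of any imputation tool, and I am assuming B is the allele being counted.

```python
def dosage(p_aa, p_ab, p_bb, tol=1e-6):
    """Expected number of B alleles, given posterior genotype probabilities.

    Hypothetical helper for illustration only; assumes B is the counted allele.
    """
    total = p_aa + p_ab + p_bb
    if abs(total - 1.0) > tol:
        raise ValueError(f"probabilities sum to {total}, expected 1.0")
    # Weight each genotype by the number of B alleles it carries (0, 1 or 2).
    return 0 * p_aa + 1 * p_ab + 2 * p_bb


# The example from this answer: roughly 1.4
print(round(dosage(0.1, 0.4, 0.5), 3))

# A confidently imputed B/B genotype gives the maximum dosage.
print(dosage(0.0, 0.0, 1.0))
```

Values such as 1.998 or 1.865 in your file would then correspond to genotypes where most, but not all, of the probability sits on B/B.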

score 2 · Answer 2 · 2013-07-02

Imputation of SNPs is a statistical guess at the likely genotype at a given locus, based on the other information about the haplotype. Due to the genetic distance between flanking markers with a known state, there is a likelihood of zero, one, two or more recombinations on the interval, resulting in a parental or recombinant haplotype. You should add more details to your post about how these numbers were generated (software, etc.) for a better answer, but I think the basic answer is this:

Both Patient A and Patient B most likely have a genotype of 2 at the locus for your SNP1. However, the data suggest that Patient A is more likely to have a 2 there than Patient B.
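To make that comparison a bit more concrete, here is a small sketch. It assumes the values are dosages of the form 0*P(0) + 1*P(1) + 2*P(2), which the post does not confirm. A single dosage does not pin down the three genotype probabilities, but it does bound P(2): it can be no lower than dosage - 1 and no higher than dosage / 2. The function name is hypothetical.

```python
def p_two_copies_bounds(dosage):
    """Range of P(genotype == 2) consistent with a given dosage.

    Assumes dosage = 1*P(1) + 2*P(2) with P(0) + P(1) + P(2) = 1.
    Illustrative helper, not taken from any imputation package.
    """
    if not 0.0 <= dosage <= 2.0:
        raise ValueError("dosage must lie between 0 and 2")
    lower = max(0.0, dosage - 1.0)  # all remaining probability on genotype 1
    upper = dosage / 2.0            # all remaining probability on genotype 0
    return lower, upper


for patient, d in [("A", 1.999), ("B", 1.865)]:
    lo, hi = p_two_copies_bounds(d)
    print(f"Patient {patient}: P(2) is between {lo:.4f} and {hi:.4f}")
```

Under that assumption, even the lowest P(2) compatible with Patient A's dosage is above the highest P(2) compatible with Patient B's, which is what "more likely to have a 2" means here.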