Question: How does PLINK (or any statistical genetics tool) deal with missing genotypes in linear/logistic regression?
William (Europe) wrote, 2.3 years ago:

Say I have a genotype matrix that uses the encoding:

0 = HOM_REF
1 = HET
2 = HOM_ALT
NA = Missing genotype

For instance, this dummy genotype matrix with 3 variants and 3 samples:

Variant_1    0 1 1 
Variant_2    1 1 NA
Variant_3    NA 0 1 
etc
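
To make the encoding concrete, here is a minimal sketch of the same dummy matrix in Python. Using `float("nan")` for NA and the helper name `missing_rate` are just assumptions for illustration, not how PLINK or any other tool stores genotypes.

```python
import math

# Dummy genotype matrix from the example: rows are variants, columns are samples.
# float("nan") stands in for NA (missing genotype); this is an illustrative choice.
NA = float("nan")
genotypes = {
    "Variant_1": [0, 1, 1],
    "Variant_2": [1, 1, NA],
    "Variant_3": [NA, 0, 1],
}


def missing_rate(calls):
    """Fraction of samples with a missing genotype for one variant."""
    return sum(math.isnan(g) for g in calls) / len(calls)


# Per-variant missingness, e.g. to see which variants would be affected most.
missing_rates = {name: missing_rate(calls) for name, calls in genotypes.items()}
```

Here Variant_1 has no missing calls, while Variant_2 and Variant_3 are each missing one of three samples.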

Do you first need to impute the genotype matrix so that it has no missing genotypes (NA values)?

Or do you set the NA values to something like -9 or -999? Wouldn't that heavily influence the output of the linear/logistic regression for variants with many missing genotypes?
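
To illustrate the concern, here is a rough sketch with hypothetical toy data. It compares a simple least-squares slope for one variant when the missing samples are dropped versus when NA is recoded to -9 and treated as a real dosage. The data, the `ols_slope` helper, and the two options are assumptions made up for this example; it does not show what PLINK actually does internally.

```python
import math

NA = float("nan")


def ols_slope(x, y):
    """Slope of a simple least-squares fit of y on x (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical toy data for one variant: among genotyped samples the
# phenotype rises by exactly 1.0 per ALT allele; two samples are missing.
genotype = [0, 0, 1, 1, 2, 2, NA, NA]
phenotype = [10.0, 10.0, 11.0, 11.0, 12.0, 12.0, 11.0, 11.0]

# Option A: drop the samples whose genotype is missing for this variant.
kept = [(g, p) for g, p in zip(genotype, phenotype) if not math.isnan(g)]
slope_dropped = ols_slope([g for g, _ in kept], [p for _, p in kept])

# Option B: recode NA as -9 and feed it to the regression as a real dosage.
recoded = [-9 if math.isnan(g) else g for g in genotype]
slope_placeholder = ols_slope(recoded, phenotype)

print(f"slope, missing samples dropped: {slope_dropped:.3f}")
print(f"slope, NA recoded as -9:        {slope_placeholder:.3f}")
```

In this toy case the slope falls from 1.0 to about 0.026 once the -9 values are treated as dosages, which is the kind of distortion I am worried about.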
