Using Linear Regression on Genotype and Expression data
6 weeks ago

Hi all,

I have read several sources, like this and this, that try to relate the expression of a gene to nearby variants (SNPs), but none of them answers the following question. There are 3 genotype codes: "0" refers to 0 minor alleles (ref/ref), "1" refers to 1 minor allele (ref/alt), and "2" refers to 2 minor alleles (alt/alt). If we consider only SNPs within 100 kbp upstream and downstream of the TSS (transcription start site), we may have ~20 SNPs for each gene, so there will be a lot of collinearity among the independent variables (the genotypes).

This is a sample of the table on which I will run linear regression (function "lm" in R):

            SNP1         SNP2           SNP3             SNP4    ...   Gene expression
   donor1    0            1              0                1                 3.5
   donor2    0            1              0                1                 4.5
   donor3    0            0              0                0                 3.0
   donor4    1            1              0                1                 5.5
   donor5    0            1              0                1                 1.5

I have ~400 donors, and many donors are like donor1 and donor5: their genotypes at these SNPs are the same. So when I run the linear regression, this warning arises: "prediction from a rank-deficient fit may be misleading".

So what should I do? Am I doing something wrong?

Thanks a lot.

Regression SNP Machine Learning Genotype • 290 views

Can you show the model that you are fitting?


I am doing this:

model <- lm(gene_expression ~ ., data = my_data_train)
pred_lm <- predict(model, newdata = my_data_test)
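
For reference, a fit of this form can be checked for rank deficiency directly. The sketch below is only illustrative: it rebuilds the five donors from the sample table as a toy `my_data_train` (the real data has ~400 donors and more SNPs), fits the same model, and lists the terms that `lm` could not estimate. It uses base R only.

```r
# Toy data copied from the five donors in the sample table (illustrative only).
my_data_train <- data.frame(
  SNP1 = c(0, 0, 0, 1, 0),
  SNP2 = c(1, 1, 0, 1, 1),
  SNP3 = c(0, 0, 0, 0, 0),
  SNP4 = c(1, 1, 0, 1, 1),
  gene_expression = c(3.5, 4.5, 3.0, 5.5, 1.5)
)

model <- lm(gene_expression ~ ., data = my_data_train)

# Terms that lm() could not estimate get an NA coefficient.
aliased_terms <- names(coef(model))[is.na(coef(model))]
print(aliased_terms)

# Compare the rank of the fit with the number of design-matrix columns
# (the intercept counts as one column).
design <- model.matrix(model)
cat("rank:", model$rank, "of", ncol(design), "columns\n")
```

A rank lower than the number of columns means at least one predictor is an exact linear combination of the others, which is the situation in which `predict` reports a rank-deficient fit.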
5 weeks ago
PeterKW ▴ 40

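This warning most likely arises because some of your covariates are collinear, e.g. SNP2 and SNP4 in the sample table you gave, which have the same genotype in every donor shown. A SNP with no variation at all, like SNP3 in that table, causes the same problem because it is collinear with the intercept. There are various other possible reasons given here. I hope this helps; give the different answers there some careful thought.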
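
One possible way to act on this, sketched here under assumptions and not as the only remedy, is to drop the redundant SNP columns before fitting. The example uses a pivoted QR decomposition (base R `qr`) to keep a linearly independent subset of columns. The data frame `snps`, the vector `expr`, and the variable `keep` are hypothetical names, and the values are copied from the five donors in the sample table.

```r
# Toy genotypes and expression values from the sample table (illustrative only).
snps <- data.frame(
  SNP1 = c(0, 0, 0, 1, 0),
  SNP2 = c(1, 1, 0, 1, 1),
  SNP3 = c(0, 0, 0, 0, 0),
  SNP4 = c(1, 1, 0, 1, 1)
)
expr <- c(3.5, 4.5, 3.0, 5.5, 1.5)

# Include an intercept column so that constant SNPs are detected as redundant.
X <- cbind(Intercept = 1, as.matrix(snps))

# qr() moves linearly dependent columns to the end of the pivot order;
# the first `rank` pivots index a linearly independent set of columns.
decomp <- qr(X)
keep <- colnames(X)[decomp$pivot[seq_len(decomp$rank)]]
keep <- setdiff(keep, "Intercept")
print(keep)

# Refit using only the retained SNPs; no coefficient should be NA now.
reduced <- cbind(snps[, keep, drop = FALSE], gene_expression = expr)
model <- lm(gene_expression ~ ., data = reduced)
print(coef(model))
```

Which of two identical SNPs is kept depends only on column order, so the fitted effect cannot be attributed to SNP2 over SNP4 from these data alone. Highly correlated but not identical SNPs are not removed by this step and can still give unstable estimates.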

