Question: Regression using genotype on expression of genes
4 weeks ago, wstla27 wrote:

I am a total beginner in bioinformatics, so this question might be very trivial. Right now, I need to run a linear regression using genotype information on the expression of some genes. I have VCF files for all the chromosomes. I am having a hard time understanding how I should feed the genotype information (0s and 1s) to the regression model. Do I use the allele frequency, or should I just use the 0s and 1s? Also, regarding the expression of genes, I have a list of the gene IDs, their related snp_ids, r-values, and p-values. To feed this into the linear model, what kind of expression value should I use?

(I am having a hard time understanding this because in all the stats courses, we simply use values and numbers. But for the biological information, there are only 0s and 1s. I can't seem to figure out how to do a regression on 0s and 1s and find their association.)

Thank you so much for your help!

snp linear regression gene • 86 views
4 weeks ago, Kevin Blighe wrote:

I am not sure what you are aiming to do, exactly. However, you should attempt to get your VCF data into an 'analysis-ready' format. This will involve either summarising it to allele tallies (continuous, e.g. the number of Alt alleles each sample carries at a variant) or maintaining it as a categorical variable (with levels for Ref, Heterozygous Alt, and Homozygous Alt).
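
As a rough illustration of the two encodings described above, here is a minimal sketch in R. It assumes you have already pulled the GT strings for one variant out of the VCF into a character vector; the vector `gt` and the helper `gt_to_dosage()` are hypothetical names made up for this example, not part of any package. Any non-reference allele is counted as Alt, which is a simplification for multi-allelic sites.

```r
# Hypothetical GT strings for one variant, one entry per sample
gt <- c("0/0", "0/1", "1|1", "1/0", "./.")

# Count Alt alleles in a genotype string: split on "/" or "|",
# and return NA when any allele is missing (".")
gt_to_dosage <- function(g) {
  alleles <- strsplit(g, "[/|]")[[1]]
  if (any(alleles == ".")) return(NA_integer_)
  sum(alleles != "0")
}

# Continuous encoding: allele tally per sample (0, 1, or 2)
dosage <- vapply(gt, gt_to_dosage, integer(1), USE.NAMES = FALSE)

# Categorical encoding: Ref, Heterozygous Alt, Homozygous Alt
genotype <- factor(dosage, levels = 0:2, labels = c("Ref", "Het", "HomAlt"))
```

Either `dosage` or `genotype` can then be used as the Variant column in a model, depending on whether you want a per-allele effect or a separate effect per genotype class.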

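After that, you can do a logistic regression or a linear regression. Note that glm() with family = binomial fits a binary logistic regression, so Variant needs to have two levels there; a three-level genotype would need a true multinomial model (for example, nnet::multinom()):

glm(Variant ~ GeneExpression, data = mydata, family = binomial(link = 'logit')) # binary logistic regression
lm(GeneExpression ~ Variant, data = mydata) # linear regression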
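
To make the linear regression concrete, here is a small sketch on simulated data (not real results). It assumes `Variant` is the Alt allele count (0, 1, or 2) per sample and `GeneExpression` is one numeric expression measurement per sample for a single gene; the sample size, effect size, and genotype proportions are arbitrary values chosen for illustration.

```r
set.seed(1)
n <- 200

# Simulated data: allele count per sample and one gene's expression
mydata <- data.frame(
  Variant = sample(0:2, n, replace = TRUE, prob = c(0.5, 0.4, 0.1))
)
mydata$GeneExpression <- 5 + 0.8 * mydata$Variant + rnorm(n)

# Additive model: one slope, the change in expression per extra Alt allele
fit_additive <- lm(GeneExpression ~ Variant, data = mydata)
print(summary(fit_additive)$coefficients)

# Categorical model: separate effects for Het and HomAlt relative to Ref
fit_categorical <- lm(GeneExpression ~ factor(Variant), data = mydata)

# F-test comparing the two nested models
print(anova(fit_additive, fit_categorical))
```

The additive model is the simpler choice when you expect each extra Alt allele to shift expression by a similar amount; the categorical model makes no such assumption but spends one more parameter.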

