Question: Regression models on genetic data
17 months ago, melania 2282 wrote:


I am using unconditional logistic regression to model the genetic effect and the gene × environment (SNP × exposure) interaction effect on my outcome.

My results are a bit strange:

When modeling only the main effect of each variant, no SNP is associated with the outcome.

When modeling the interaction term exposure:SNP together with the additive SNP term, I get a strongly significant signal for the additive term only (two SNPs with p << 10e-8) and nothing for the interaction.

I code each SNP as 0, 1 or 2 (number of effect alleles), and the exposure is a continuous variable.

I am working on a case-control study (2300 subjects) and testing 7000 SNPs.

Can this be a reliable result? How could it be explained?

Thank you very much !


What, precisely, is your model formula? Is it outcome ~ exposure:SNP + SNP?

Working with regression models can be difficult (and 'risky') - basically, it is possible to find a statistically significant p-value by messing around with the model formula; however, the models may be meaningless. Without also looking at the standard errors, the beta coefficients, and odds ratios, one cannot really make any interpretation based solely on the p-value. Also, should you be adjusting for population stratification?
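As a rough sketch of what looking beyond the p-value means for a single coefficient: given a beta and its standard error on the log-odds scale, the odds ratio, a Wald confidence interval and the two-sided Wald p-value follow directly. The function name `summarize_coefficient` and the example numbers are hypothetical, not taken from the study discussed here.

```python
import math
from statistics import NormalDist


def summarize_coefficient(beta, se, level=0.95):
    """Summarize one logistic-regression coefficient (log-odds scale)."""
    # Critical value of the standard normal for the requested interval
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    z = beta / se
    # Two-sided Wald p-value; erfc stays accurate for very small p-values
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {
        "odds_ratio": math.exp(beta),
        "ci_low": math.exp(beta - z_crit * se),
        "ci_high": math.exp(beta + z_crit * se),
        "z": z,
        "p_value": p_value,
    }


# Made-up values, for illustration only
print(summarize_coefficient(beta=0.25, se=0.08))
```

A tiny p-value paired with an implausibly large odds ratio or a very wide interval is usually a hint that the model, rather than the biology, is producing the signal.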

written 17 months ago by Kevin Blighe

Thank you Kevin. I am adjusting for principal components (PCA), and this is the result of a meta-analysis of two different studies. My model is outcome ~ exposure:SNP + SNP + exposure + other cofactors; I then ran the meta-analysis, from which I get the significant result.

written 17 months ago by melania 2282

Ah, a model formula like this:

outcome ~ exposure:SNP + SNP + exposure is the same as:

outcome ~ exposure * SNP

, i.e., it is a multiplicative model, also sometimes called the 'log-additive model'. Perhaps this may assist in the interpretation? As an example, I conducted a similar study in 2016 (but it was conditional regression with Family ID as the matched strata) and I also used a multiplicative model. How are the standard errors?

Lemire has provided an answer, below.

written 17 months ago by Kevin Blighe

17 months ago, Lemire wrote:

In your regression equation, you have the following terms:

beta_s * SNP + beta_i * SNP * exposure (ignoring the other ones you may have)
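For reference, the full model from this thread can be written out as below. The symbols \beta_0 (intercept) and \beta_e (exposure coefficient) are labels assumed here for illustration; the other cofactors are left as dots.

```latex
\begin{aligned}
\operatorname{logit} P(\text{outcome} = 1)
  &= \beta_0 + \beta_s\,\text{SNP} + \beta_e\,\text{exposure}
     + \beta_i\,\text{SNP}\cdot\text{exposure} + \dots \\
  &= \beta_0 + \beta_e\,\text{exposure}
     + \bigl(\beta_s + \beta_i\,\text{exposure}\bigr)\,\text{SNP} + \dots
\end{aligned}
```

Grouping the terms this way shows that the coefficient multiplying SNP is not a single number but depends on the exposure value.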

The estimate for beta_s (from which you derived your significance) is the slope of the effect of the SNP on your outcome when the exposure variable is equal to 0. That's how you need to interpret your result. The only thing you can say from your output is that your SNP has a significant effect when the exposure is 0. If the exposure were equal to, e.g., 2, then the effect (slope) of the SNP would be beta_s + 2*beta_i (which has a different standard error, and thus a different significance level). Don't overinterpret each coefficient taken separately.
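A minimal sketch of that calculation, assuming the two coefficients and their variance-covariance entries can be extracted from the fitted model. The variance of beta_s + e*beta_i is Var(beta_s) + e^2 * Var(beta_i) + 2 * e * Cov(beta_s, beta_i). The helper `snp_effect_at` is hypothetical, and all numbers are made up for illustration.

```python
import math


def snp_effect_at(exposure, beta_s, beta_i, var_s, var_i, cov_si):
    """Log-odds slope of the SNP at a given exposure value, with SE and Wald z."""
    slope = beta_s + exposure * beta_i
    # Variance of a linear combination of two correlated estimates
    variance = var_s + exposure ** 2 * var_i + 2 * exposure * cov_si
    se = math.sqrt(variance)
    return slope, se, slope / se


# Hypothetical estimates (not from the study in this thread)
beta_s, beta_i = 0.60, -0.20
var_s, var_i, cov_si = 0.0100, 0.0016, -0.0036

for e in (0.0, 2.0, 3.0):
    slope, se, z = snp_effect_at(e, beta_s, beta_i, var_s, var_i, cov_si)
    print(f"exposure={e:.1f}  slope={slope:.3f}  se={se:.3f}  z={z:.2f}")
```

With numbers like these, the SNP slope is far from zero at exposure 0 but shrinks toward zero at higher exposure values, which is one way a main-effects-only model could show nothing while beta_s looks strongly significant. Centering the exposure before fitting would make beta_s the SNP effect at the mean exposure instead of at 0.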
