21 months ago, tujuchuanli wrote:

Hi, I chose two genes and performed survival analysis on them. The model is "coxph(Surv(time,censor) ~ exprs1+exprs2)", where exprs1 and exprs2 are the TMM-normalized expression values for gene1 and gene2. Time is the survival time (for dead patients) or the last follow-up time (for alive patients), and censor indicates whether each patient is dead or alive.

You can see this plot by clicking this url (https://pan.baidu.com/s/1BDrmHAuAmW6fiyfsSJcHyw).

You can see that the group with low expression of both gene1 and gene2 has the worst survival, and the group with high expression of both genes has the best (using the median expression value as the cutoff). However, the group with high expression of gene2 and low expression of gene1 lies between the worst and the best curves, as does the group with high expression of gene1 and low expression of gene2.

Here is my question: can I say that there is an interaction between these two genes with respect to survival? If so, can I change my model to "coxph(Surv(time,censor) ~ exprs1*exprs2)" to test for the interaction?

Thanks


Hi Kevin,

It is very nice to see you~~.

If I use "coxph(Surv(time,censor) ~ exprs1)", "coxph(Surv(time,censor) ~ exprs2)" or "coxph(Surv(time,censor) ~ exprs1+exprs2)", all three results are significant.

However, there is something I didn't explain clearly in my post. The plot you saw was produced by "km.coxph.plot(formula.s=Surv(time=time, died) ~ group)", where group is a factor that defines the four groups in the plot. The cutoff used is the median of the logCPM values for gene1 and gene2. Below is the output when I use the "summary" function on the Cox model:

https://pan.baidu.com/s/1D2z7715PFHLhwLA7_go0AA

Hello tujuchuanli - nice to see you, too.

I see... I thought that your groups were somehow defined by `exprs1` and `exprs2`. In your case, those p-values are produced by comparing the following:

• group2 versus group1
• group3 versus group1
• group4 versus group1

So, `group1` is regarded as the reference level. For the model, generally, you could report the Score (logrank) test p-value.
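The comparisons listed above can be sketched in R as follows. This is only an illustration on simulated data, not your actual data: the data frame `df` and its columns `time`, `died` and `group` are assumptions chosen to mirror the names in this thread.

```r
library(survival)

# Simulated data for illustration only; column names are assumptions
set.seed(1)
n <- 200
df <- data.frame(
  time  = rexp(n, rate = 0.1),
  died  = rbinom(n, 1, 0.7),
  group = factor(sample(paste0("group", 1:4), n, replace = TRUE))
)

# The first factor level is the reference; relevel() sets it explicitly
df$group <- relevel(df$group, ref = "group1")

fit <- coxph(Surv(time, died) ~ group, data = df)
s <- summary(fit)

# One row per non-reference level: group2, group3 and group4 versus group1
s$coefficients

# Overall Score (logrank) test for the model
s$sctest
```

Changing `ref` in `relevel()` changes which group the other three are compared against, but the overall Score test is unaffected by that choice.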

The p-values are very low. I will not ask what these genes are, though.

Following your suggestion that 'interaction' is a bit misleading, I've changed my wording. I now describe these as gene pairs in which different combinations of the two genes' expression levels are associated with different survival outcomes. Is that better?

BTW, do you think this is an interesting finding?

Please use `ADD REPLY/ADD COMMENT` when responding to existing posts to keep the thread logically organized.

21 months ago, Kevin Blighe wrote:

The lines 'cross over', but using the word 'interaction' in this context is a bit misleading; the level of cross-over is also minimal. Also, when you have `~ exprs1+exprs2`, you are specifying an additive effect of these genes on the log-hazard scale, with each gene getting its own coefficient and no interaction term. So, it makes sense for cross-over to occur when you compare `HIGH`/`LOW` and `LOW`/`HIGH`. How does it look if you just run:

```
~ exprs1

~ exprs2
```
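As a rough sketch on simulated data (not your data), the single-gene fits would look like the following. The comparison of the additive model against one with an interaction term is an extra step added here for illustration, as one possible way to test formally for an interaction. The data frame `df` is an assumption; the variable names follow the thread.

```r
library(survival)

# Simulated data for illustration only; variable names follow the thread
set.seed(1)
n <- 200
df <- data.frame(
  exprs1 = rnorm(n),
  exprs2 = rnorm(n),
  time   = rexp(n, rate = 0.1),
  censor = rbinom(n, 1, 0.7)
)

# Single-gene models
fit1 <- coxph(Surv(time, censor) ~ exprs1, data = df)
fit2 <- coxph(Surv(time, censor) ~ exprs2, data = df)
summary(fit1)$coefficients
summary(fit2)$coefficients

# Additive model versus a model that also includes exprs1:exprs2
fitAdd <- coxph(Surv(time, censor) ~ exprs1 + exprs2, data = df)
fitInt <- coxph(Surv(time, censor) ~ exprs1 * exprs2, data = df)

# Likelihood ratio test: does the interaction term improve the fit?
anova(fitAdd, fitInt)
```

A small p-value from `anova()` would suggest that the effect of one gene depends on the level of the other, which is a stronger statement than curves crossing in a Kaplan-Meier plot.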

Also, using the TMM counts, which are on the negative binomial scale, may be biasing the result. Assuming that you have used edgeR, you should be inputting logCPM values.
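A minimal sketch of getting logCPM values out of edgeR, assuming you start from a raw count matrix with genes in rows and samples in columns. The `counts` matrix and the gene names below are simulated placeholders, not your data.

```r
library(edgeR)

# Simulated raw counts for illustration only (genes in rows, samples in columns)
set.seed(1)
counts <- matrix(
  rnbinom(1000 * 20, mu = 50, size = 2),
  nrow = 1000,
  dimnames = list(paste0("gene", 1:1000), paste0("sample", 1:20))
)

y <- DGEList(counts = counts)
y <- calcNormFactors(y, method = "TMM")  # compute TMM scaling factors

# log2 counts per million, using the TMM-adjusted library sizes
logCPM <- cpm(y, log = TRUE)

# Hypothetical choice of two genes to carry into the Cox model
exprs1 <- logCPM["gene1", ]
exprs2 <- logCPM["gene2", ]
```

The resulting `exprs1` and `exprs2` vectors hold one value per sample and can be used as covariates in `coxph()` once they are matched to the survival data by sample ID.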