Question: question about survival analysis
21 months ago, tujuchuanli (60) wrote:

Hi, I chose two genes and performed survival analysis on them. The model is "coxph(Surv(time,censor) ~ exprs1+exprs2)", where exprs1 and exprs2 are the TMM-normalized expression values for gene1 and gene2. Time is the survival time (for dead patients) or the last follow-up time (for living patients), and censor records whether each patient is dead or alive.
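For readers who want to reproduce the setup, here is a minimal sketch of the additive model in R. The data are simulated, so the sample size, effect sizes, and the coding of censor (1 = dead, 0 = alive at last follow-up) are all assumptions rather than the real data set.

```r
library(survival)

set.seed(1)
n <- 200

# Hypothetical log-scale expression values for the two genes
exprs1 <- rnorm(n)
exprs2 <- rnorm(n)

# Simulated follow-up times: higher expression lowers the hazard
time <- rexp(n, rate = 0.1 * exp(-0.5 * exprs1 - 0.5 * exprs2))

# Event indicator (assumed coding): 1 = dead, 0 = alive/censored
censor <- rbinom(n, 1, 0.7)

# Additive model: one log hazard ratio per gene
fit_add <- coxph(Surv(time, censor) ~ exprs1 + exprs2)

# Coefficient table: coef, exp(coef), se(coef), z, p-value
summary(fit_add)$coefficients
```

Each row of the table is the effect of one gene adjusted for the other; exp(coef) is the hazard ratio per unit increase in expression.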

You can see this plot by clicking this url (

You can see that the group with low expression of both gene1 and gene2 has the worst survival, and the group with high expression of both genes has the best (using the median expression value as the cutoff). However, the group with high expression of gene2 and low expression of gene1 lies between the worst and the best curves (the same is true for the group with high expression of gene1 and low expression of gene2).

Here is my question: can I say that there is an interaction between these two genes with respect to survival? If the answer is yes, can I change my model to coxph(Surv(time,censor) ~ exprs1*exprs2) to test for the interaction?
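As a rough sketch of what that change does: in R's formula syntax, exprs1*exprs2 expands to exprs1 + exprs2 + exprs1:exprs2, so the model gains one extra coefficient for the product term. The example below uses simulated data (all values are assumptions) and compares the two nested models with a likelihood ratio test.

```r
library(survival)

set.seed(2)
n <- 200

# Simulated stand-ins for the real data (assumed values)
dat <- data.frame(
  exprs1 = rnorm(n),
  exprs2 = rnorm(n),
  time   = rexp(n, rate = 0.1),
  censor = rbinom(n, 1, 0.7)   # assumed coding: 1 = dead, 0 = alive
)

# Additive model and model with an interaction term
fit_add <- coxph(Surv(time, censor) ~ exprs1 + exprs2, data = dat)
fit_int <- coxph(Surv(time, censor) ~ exprs1 * exprs2, data = dat)

# The interaction model has a third coefficient, "exprs1:exprs2"
coef(fit_int)

# Likelihood ratio test of the nested models
anova(fit_add, fit_int)
```

A small p-value for the exprs1:exprs2 term (or for the model comparison) would be the evidence for an interaction; the shape of the Kaplan-Meier curves alone does not test it.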



Hi Kevin,

It is very nice to see you~~.

If I use "coxph(Surv(time,censor) ~ exprs1)" or "coxph(Surv(time,censor) ~ exprs2)" or "coxph(Surv(time,censor) ~ exprs1+exprs2)", all three results are significant.

However, there is something I didn't explain clearly in my post. The plot you see was produced by "km.coxph.plot(formula.s=Surv(time=time, died) ~ group)", where group is a factor that defines the four groups in the plot. The cutoff used is the median of the logCPM values for gene1 and gene2. Below is the output of the "summary" function on the Cox model.
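To make the grouping concrete, here is a hedged sketch of how such a four-level factor could be built from median cutoffs. The data are simulated, the group labels are made up for illustration, and the curves are drawn with survfit() and plot() from the survival package rather than km.coxph.plot.

```r
library(survival)

set.seed(3)
n <- 200

# Simulated logCPM-like values and outcomes (assumed values)
dat <- data.frame(
  gene1 = rnorm(n, mean = 5),
  gene2 = rnorm(n, mean = 5),
  time  = rexp(n, rate = 0.1),
  died  = rbinom(n, 1, 0.7)    # assumed coding: 1 = dead, 0 = alive
)

# Dichotomise each gene at its own median
hi1 <- dat$gene1 > median(dat$gene1)
hi2 <- dat$gene2 > median(dat$gene2)

# Combine the two splits into one four-level factor (labels are illustrative)
dat$group <- factor(
  paste(ifelse(hi1, "g1_high", "g1_low"),
        ifelse(hi2, "g2_high", "g2_low"), sep = "/"),
  levels = c("g1_low/g2_low", "g1_low/g2_high",
             "g1_high/g2_low", "g1_high/g2_high")
)

# Kaplan-Meier curves, one per group
km <- survfit(Surv(time, died) ~ group, data = dat)
plot(km, col = 1:4, xlab = "Time", ylab = "Survival probability")
legend("topright", legend = levels(dat$group), col = 1:4, lty = 1)
```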

written 21 months ago by tujuchuanli (60)

Hello tujuchuanli - nice to see you, too.

I see... I thought that your groups were somehow defined by exprs1 and exprs2. In your case, those p-values are produced by comparing the following:

  • group2 versus group1
  • group3 versus group1
  • group4 versus group1

So, group1 is regarded as the reference level. For the model, generally, you could report the Score (logrank) test p-value.
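A minimal sketch of those comparisons, on simulated data with hypothetical level names group1 to group4: each coefficient is a contrast against the first factor level, and the overall Score (logrank) test is stored in the sctest element of the model summary.

```r
library(survival)

set.seed(4)
n <- 200

# Simulated data; the level names are assumptions for illustration
dat <- data.frame(
  time  = rexp(n, rate = 0.1),
  died  = rbinom(n, 1, 0.7),
  group = factor(sample(paste0("group", 1:4), n, replace = TRUE),
                 levels = paste0("group", 1:4))
)

fit <- coxph(Surv(time, died) ~ group, data = dat)

# Three rows: group2, group3 and group4, each versus group1
summary(fit)$coefficients

# Overall Score (logrank) test: statistic, degrees of freedom, p-value
summary(fit)$sctest

# Changing the reference level changes which contrasts are reported
dat$group <- relevel(dat$group, ref = "group4")
fit_ref4 <- coxph(Surv(time, died) ~ group, data = dat)
names(coef(fit_ref4))
```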

The p-values are very low. I will not ask what these genes are, though.

written 21 months ago by Kevin Blighe (65k)

Thank you for your reply, Kevin~~

Following your suggestion that 'interaction' is a bit misleading, I've changed my wording. I now describe these gene pairs as pairs in which different combinations of the two genes' expression levels are associated with different survival outcomes. Is that better?

BTW, do you think this is an interesting finding?

written 21 months ago by tujuchuanli (60)

Please use ADD REPLY/ADD COMMENT when responding to existing posts to keep the thread logically organized.

written 21 months ago by genomax (90k)
21 months ago, Kevin Blighe (65k) wrote:

The lines 'cross over', but using the word 'interaction' in this context is a bit misleading; the level of cross-over is also minimal. Also, when you have ~ exprs1+exprs2, you are indicating an additive effect of these genes and that these genes have equal weighting. So, it makes sense for cross-over to occur when you compare HIGH/LOW and LOW/HIGH. How does it look if you just run:

~ exprs1

~ exprs2
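One way to run that check, sketched on simulated data (all values are assumptions), is to fit each gene on its own and look at the hazard ratios with their confidence intervals:

```r
library(survival)

set.seed(5)
n <- 200

# Simulated stand-ins for the real data (assumed values)
dat <- data.frame(
  exprs1 = rnorm(n),
  exprs2 = rnorm(n),
  time   = rexp(n, rate = 0.1),
  censor = rbinom(n, 1, 0.7)   # assumed coding: 1 = dead, 0 = alive
)

# One gene per model
fit1 <- coxph(Surv(time, censor) ~ exprs1, data = dat)
fit2 <- coxph(Surv(time, censor) ~ exprs2, data = dat)

# Hazard ratio and 95% confidence interval for each gene
hr <- rbind(
  exprs1 = c(HR = unname(exp(coef(fit1))), exp(confint(fit1))[1, ]),
  exprs2 = c(HR = unname(exp(coef(fit2))), exp(confint(fit2))[1, ])
)
hr
```

If the single-gene hazard ratios are close to those from the two-gene model, the genes are contributing largely independent information.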

Also, using the TMM counts, which are on the negative binomial scale, may be biasing the result. Assuming that you have used edgeR, you should be inputting logCPM counts.
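As a rough sketch, assuming edgeR is installed and starting from a raw count matrix, logCPM values can be obtained like this. The count matrix here is simulated, and the gene and sample names are hypothetical.

```r
library(edgeR)

set.seed(6)

# Simulated raw counts: 20 genes x 6 samples (assumed values)
counts <- matrix(
  rnbinom(20 * 6, mu = 100, size = 5),
  nrow = 20,
  dimnames = list(paste0("gene", 1:20), paste0("sample", 1:6))
)

# Build the DGEList and compute TMM normalisation factors
y <- DGEList(counts = counts)
y <- calcNormFactors(y, method = "TMM")

# log2 counts per million, using the TMM-adjusted library sizes
logcpm <- cpm(y, log = TRUE, prior.count = 2)

# Per-sample values for one gene, ready to use as a Cox model covariate
exprs1 <- logcpm["gene1", ]
exprs1
```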
