Question: Questions about survival analysis
2.3 years ago, tujuchuanli wrote:

Hi all,

I want to identify prognostic genes by survival analysis in the TCGA BRCA dataset. Here I basically followed the approach of a previous study. My plan is to analyze genes one by one and pick those with a significant Cox p-value (p < 0.05).

The survival model is below (using the survival package in R):

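library(survival)
# Univariate Cox model with expression as a continuous covariate
coxmodel <- coxph(Surv(time, censor) ~ exprs)
summary.coxmodel <- summary(coxmodel)
# Log hazard ratio and Wald p-value for exprs, selected by name
coef <- coef(summary.coxmodel)["exprs", "coef"]
coef.pvalue <- coef(summary.coxmodel)["exprs", "Pr(>|z|)"]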

Here time is the survival time, censor is the event indicator (died or not died), and exprs is the gene expression value (RNA-seq data, RPKM).
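The gene-by-gene plan described above could be sketched roughly as follows. This is only an illustration on simulated data: `expr_mat`, `fit_gene`, and the five gene names are hypothetical stand-ins, not objects from the original analysis.

```r
library(survival)

# Simulated stand-ins (assumptions): 100 samples, 5 hypothetical genes
set.seed(1)
n <- 100
time <- rexp(n, rate = 0.1)          # survival time
censor <- rbinom(n, 1, 0.7)          # 1 = died, 0 = censored
expr_mat <- matrix(rnorm(n * 5), nrow = n,
                   dimnames = list(NULL, paste0("gene", 1:5)))

# Fit one univariate Cox model per gene; return coefficient and Wald p-value
fit_gene <- function(exprs) {
  s <- coef(summary(coxph(Surv(time, censor) ~ exprs)))
  c(coef = s["exprs", "coef"], p = s["exprs", "Pr(>|z|)"])
}

# One row per gene, columns "coef" and "p"
res <- t(apply(expr_mat, 2, fit_gene))

# Genes passing the p < 0.05 cutoff used in the post
sig <- res[res[, "p"] < 0.05, , drop = FALSE]
```

With real data, `expr_mat` would be a samples-by-genes matrix aligned to the `time` and `censor` vectors.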

Then I want to display some of the genes with a significant Cox p-value using Kaplan-Meier plots. Basically I followed a post by Kevin (cox proportional hazard model; by the way, Kevin, I hope you can see this post and give me some suggestions). I use the median of gene expression as the cutoff to divide samples into two groups (high and low expression).

The plot gives me a log-rank p-value, which is always much bigger than the Cox p-value (usually about 100 times, for the several genes I tried).

My question is: how can I get a plot whose p-value matches my Cox p-value? Or do I have to try several cutoffs to find the best-fitting plot?
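A gap like this is not surprising in itself: the Cox model uses expression as a continuous variable, while the log-rank test only sees the high/low label after the median split, so the two tests answer different questions. A minimal sketch on simulated data (all values below are assumptions, not TCGA data) puts the three p-values side by side:

```r
library(survival)

set.seed(42)
n <- 200
exprs <- rnorm(n)                               # simulated expression (assumption)
time <- rexp(n, rate = 0.1 * exp(0.3 * exprs))  # hazard rises with expression
censor <- rbinom(n, 1, 0.8)                     # 1 = died, 0 = censored

# Cox model on continuous expression (Wald p-value)
cox.p <- coef(summary(coxph(Surv(time, censor) ~ exprs)))["exprs", "Pr(>|z|)"]

# Median split, then log-rank test between the two groups
group <- ifelse(exprs > median(exprs), "high", "low")
lr <- survdiff(Surv(time, censor) ~ group)
logrank.p <- pchisq(lr$chisq, df = 1, lower.tail = FALSE)

# Cox model on the same binary grouping, for a like-for-like comparison
bin.p <- coef(summary(coxph(Surv(time, censor) ~ group)))[1, "Pr(>|z|)"]

# Kaplan-Meier curves for the two groups
km <- survfit(Surv(time, censor) ~ group)

pvals <- c(cox.continuous = cox.p, logrank.median = logrank.p, cox.median = bin.p)
```

Calling `plot(km)` draws the curves. If the p-value shown on the plot should correspond to a model, fitting the Cox model on the same binary `group` variable is one consistent option; trying many cutoffs until the plot looks best inflates the false-positive rate.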


I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

written 2.3 years ago by WouterDeCoster

Thanks, I will try it next time.

written 2.3 years ago by tujuchuanli

Your worry appears to be that the P values are just very different. Is this correct? Are your sample numbers low (or imbalanced) in either of the groups being compared?

written 2.3 years ago by Kevin Blighe

Hi Kevin, nice to see you again.

Here, I have two groups of genes. I want to check the number of prognostic genes in groups A and B (prognostic genes are defined as Cox p-value <= 0.05). Actually the percentage of prognostic genes in group A is higher than in B (12% vs 6% in the BRCA dataset).

That is an overview of the data. Next I wanted to display some of the genes with Kaplan-Meier plots, and that is where I ran into the problem above. Since I am just a newbie to survival analysis, I don't know how to deal with it.

Can I say a gene is a prognostic gene even if the log-rank p-value is not significant but the Cox p-value is significant?

By the way, Kevin. Could you please check my other two posts (C: question about identifying differential expressed genes in TCGA and and give me some suggestions? Your suggestions are very important to me.


written 2.3 years ago by tujuchuanli

Thanks, Kevin. I will check the post. Thank you again.

written 2.3 years ago by tujuchuanli
2.3 years ago, Kevin Blighe wrote:

To 'intimately' understand the log rank and Cox proportional hazards tests, I would encourage you to post on

From my general understanding: the log rank, Wald, and likelihood ratio tests are just comparing the different arms of your survival 'curve'. The Cox test, then, will do the same but take into account any adjustments that you are making in the model.

For example, we can build a Cox model and include various covariates in the model, such as smoking status, BMI, exposure to allergens, etc. The Cox model will analyse the survival curves and 'adjust' for these covariates when reporting P values and hazard ratios, whilst the log rank test will not. So, the log rank test can be misleading.
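To illustrate the adjustment point, here is a rough sketch on simulated data. The `smoker` covariate and all effect sizes are made up for the example; in this simulation only smoking drives the hazard, and expression is merely correlated with it.

```r
library(survival)

set.seed(7)
n <- 200
smoker <- rbinom(n, 1, 0.4)                      # hypothetical covariate
exprs <- rnorm(n, mean = 0.5 * smoker)           # expression correlated with smoking
time <- rexp(n, rate = 0.1 * exp(0.8 * smoker))  # only smoking affects the hazard
censor <- rbinom(n, 1, 0.8)                      # 1 = died, 0 = censored

# Unadjusted and smoking-adjusted Cox models
unadj <- coxph(Surv(time, censor) ~ exprs)
adj <- coxph(Surv(time, censor) ~ exprs + smoker)

# Hazard ratio and Wald p-value for exprs in each model
hr.unadj <- coef(summary(unadj))["exprs", c("exp(coef)", "Pr(>|z|)")]
hr.adj <- coef(summary(adj))["exprs", c("exp(coef)", "Pr(>|z|)")]

# Global tests reported by summary(): likelihood ratio, Wald, score (log-rank)
s <- summary(unadj)
tests <- rbind(LR = s$logtest, Wald = s$waldtest, score = s$sctest)
```

Printing `hr.unadj` and `hr.adj` shows how the estimate for `exprs` changes once `smoker` is in the model, and `tests` shows the three global tests that `summary()` reports for a fitted Cox model.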
