I want to identify prognostic genes by survival analysis in the TCGA BRCA dataset. Here I basically followed the approach of a previous study (https://peerj.com/articles/1499/). My plan is to analyze the genes one by one and pick those with a significant Cox p-value (p < 0.05).
The survival model is below (using the survival package in R):
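library(survival)
# Univariate Cox model for a single gene
coxmodel <- coxph(Surv(time, censor) ~ exprs)
summary.coxmodel <- summary(coxmodel)
# Coefficient table with columns coef, exp(coef), se(coef), z, Pr(>|z|)
coef <- coef(summary.coxmodel)
# Wald p-value of the expression term
coef.pvalue <- coef(summary.coxmodel)[, "Pr(>|z|)"]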
Here time is the survival time, censor is the event indicator (died or not), and exprs is the gene expression value (RNA-seq data, RPKM).
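For reference, the gene-by-gene screen described above could be sketched roughly as follows. This is only a minimal sketch on simulated data: expr_mat, the gene names, and all simulated values are hypothetical stand-ins, not the TCGA BRCA data.

```r
library(survival)

# Simulated stand-in data (hypothetical; replace with the real clinical and expression tables)
set.seed(1)
n <- 200
event_time <- rexp(n, rate = 0.1)              # latent time to death
cens_time  <- rexp(n, rate = 0.05)             # latent censoring time
time   <- pmin(event_time, cens_time)          # observed follow-up time
censor <- as.integer(event_time <= cens_time)  # 1 = died, 0 = censored
expr_mat <- matrix(rnorm(n * 5), nrow = n,
                   dimnames = list(NULL, paste0("gene", 1:5)))

# Fit one univariate Cox model per gene; keep the hazard ratio and Wald p-value
cox_results <- t(sapply(colnames(expr_mat), function(g) {
  fit <- coxph(Surv(time, censor) ~ expr_mat[, g])
  s <- coef(summary(fit))
  c(HR = unname(s[1, "exp(coef)"]), p = unname(s[1, "Pr(>|z|)"]))
}))
cox_results <- as.data.frame(cox_results)

# Genes passing the p < 0.05 cutoff
significant <- rownames(cox_results)[cox_results$p < 0.05]
```

The result has one row per gene, so the significant genes can be picked by filtering on the p column.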
Then I want to display some of the genes with a significant Cox p-value in a Kaplan-Meier plot. I basically followed the post by Kevin (cox proportional hazard model; by the way, Kevin, I hope you see this post and can give me some suggestions). I use the median gene expression as the cutoff to divide the samples into two groups (high and low expression).
The plot gives me a log-rank p-value, which is always much bigger than the Cox p-value (usually about 100 times bigger, for the several genes I tried).
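To make the comparison concrete, here is a minimal sketch that computes both p-values for one gene on simulated data. The simulated effect size and all values are assumptions for illustration only. The two tests answer different questions: the Cox model uses expression as a continuous covariate, while the log-rank test only compares the two median-split groups, so the p-values need not agree.

```r
library(survival)

# Simulated stand-in data (hypothetical): hazard increases with expression
set.seed(2)
n <- 300
exprs <- rnorm(n)
event_time <- rexp(n, rate = 0.1 * exp(0.3 * exprs))
cens_time  <- rexp(n, rate = 0.05)
time   <- pmin(event_time, cens_time)
censor <- as.integer(event_time <= cens_time)  # 1 = died, 0 = censored

# Cox model on continuous expression (Wald p-value)
p_cox <- coef(summary(coxph(Surv(time, censor) ~ exprs)))[1, "Pr(>|z|)"]

# Median split into high/low groups, then a log-rank test
group <- ifelse(exprs > median(exprs), "high", "low")
lr <- survdiff(Surv(time, censor) ~ group)
p_logrank <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)

# Kaplan-Meier curves for the two groups
km <- survfit(Surv(time, censor) ~ group)
plot(km, col = c("red", "blue"), xlab = "Time", ylab = "Survival probability")
legend("topright", legend = names(km$strata), col = c("red", "blue"), lty = 1)

print(c(cox = p_cox, logrank = p_logrank))
```

Fitting coxph(Surv(time, censor) ~ group) on the same split would give a p-value that is directly comparable to the log-rank one, since both then describe the same two groups.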
My question is: how can I get a plot that matches my Cox p-value? Or do I just have to try several cutoffs to find the best-fitting plot?