Question

Cox proportional hazards model

6.2 years ago

liu4gre ▴ 210

Hi All,

I am wondering how to derive the HR, CI and p values for each factor from a Cox model, as in the table below. Using coxph only gives these values for groups such as BRCA status, tumor stage ...

Thanks.

[image: table of Cox model results]

survival cox • 9.9k views

updated 6.2 years ago by Kevin Blighe 87k • written 6.2 years ago by liu4gre ▴ 210

score 10 · Answer 1 · 2018-02-23

Update 13th March, 2019:

I posted a related tutorial: Survival analysis with gene expression

------------------------------

Update 24th September, 2018:

Note that the plot does not match the data, and neither do the stat values. Everything here is for display purposes only.

-------------------------------------

This would have been performed in the realm of survival analysis, looking at overall survival (OS) and progression-free survival (PFS), as you can probably see.

The starting point for the Cox Proportional Hazards Regression (Cox) is data in this format:

head(df)
    OS Event  Group
1 1065      0 group1
2    0      0 group2
3  883      0 group1
4   33      1 group2
5  790      0 group1
6 2517      1 group2

The columns are

OS: overall survival (days, weeks, months, years - just needs to be consistent)
Event: event indicator (1 = event occurred, 0 = censored); the event can be, e.g., death, diagnosis, or some other event
Group: the categories of interest - can be anything such as ER status, IHC scores for CD20, race, or something else

Cox is run with coxph in R (survival package), and it needs to be performed on a survival object, e.g., one produced by Surv.
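As a minimal sketch of that step, the following builds a small toy data frame in the format shown above and wraps it in a survival object with Surv. The values and the object names toy and so are made up for illustration and are not the data used in this answer. Censored observations (Event = 0) are printed with a trailing +.

```r
library(survival)

# Toy data in the same layout as head(df) above (values are illustrative only)
toy <- data.frame(
  OS    = c(1065, 120, 883, 33, 790, 2517),
  Event = c(0, 1, 0, 1, 0, 1),   # 1 = event observed, 0 = censored
  Group = c("group1", "group2", "group1", "group2", "group1", "group2")
)

# Surv pairs each follow-up time with its event indicator
so <- Surv(time = toy$OS, event = toy$Event)
print(so)
```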

As per the table (above), there is a reference level for the category of interest, e.g., BRCA wild-type. Thus, we must also choose a reference category against which all other categories will be compared (here group1 is the reference):

df$Group <- factor(df$Group, levels=c("group1","group2","group3","group4"))
df$Group
  [1] group1 group2 group1 group2 group1 group2 group3 group2 group3 group2
 [11] group1 group4 group4 group3 group4 group4 group2 group3 group1 group3
 [21] group4 group4 group4 group1 group3 group3 group2 group1 group3 group4
 [31] group1 group1 group4 group2 group3 group3 group4 group3 group2 group4
 [41] group4 group3 group3 group4 group4 group4 group3 group2 group2 group1
 *et cetera*
Levels: group1 group2 group3 group4
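
If a different reference category is wanted, one option is relevel from base R, which moves the chosen level to the first position. With R's default treatment contrasts, the first level is the one that the other levels are compared against. A minimal sketch on a toy factor (the names grp and grp2 are just for illustration):

```r
# Toy factor with the same four levels as above
grp <- factor(c("group1", "group2", "group3", "group4", "group2"),
              levels = c("group1", "group2", "group3", "group4"))

levels(grp)[1]   # current reference level

# Make group2 the reference instead; the other levels keep their order
grp2 <- relevel(grp, ref = "group2")
levels(grp2)
```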

Now we can actually generate hazard ratios (including CIs) and P values:

library(survival)
coxmodel <- coxph(Surv(time = OS, event = Event) ~ Group, data=df)
summary(coxmodel)
Call:
coxph(formula = Surv(time = OS, event = Event) ~ Group, data = df)

  n= 106, number of events= 106 

                coef exp(coef) se(coef)      z Pr(>|z|)
Groupgroup2  0.15929   1.17267  0.29957  0.532    0.595
Groupgroup3  0.03724   1.03794  0.27747  0.134    0.893
Groupgroup4 -0.14772   0.86267  0.28570 -0.517    0.605

            exp(coef) exp(-coef) lower .95 upper .95
Groupgroup2    1.1727     0.8528    0.6519     2.109
Groupgroup3    1.0379     0.9634    0.6025     1.788
Groupgroup4    0.8627     1.1592    0.4928     1.510

Concordance= 0.515  (se = 0.032 )
Rsquare= 0.011   (max possible= 0.999 )
Likelihood ratio test= 1.19  on 3 df,   p=0.7566
Wald test            = 1.18  on 3 df,   p=0.7575
Score (logrank) test = 1.19  on 3 df,   p=0.7563

The P values for each category are given by Pr(>|z|), the HRs by exp(coef), and the 95% CIs by lower .95 and upper .95. Just to be sure, here are the lower (2.5%) and upper (97.5%) bounds of the 95% CIs for the HRs:

exp(confint(coxmodel))
                  2.5 %   97.5 %
Groupgroup2 0.6519067 2.109444
Groupgroup3 0.6025433 1.787944
Groupgroup4 0.4927892 1.510188
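
To collect the HR, CI and P value for every level in one table, the pieces can be pulled out of the summary object. The sketch below uses simulated data in the same format as df (the names sim, fit, res and manual are assumptions for illustration, and the numbers will not match the output above). It also checks that the interval is simply exp(coef ± z × se(coef)).

```r
library(survival)

# Simulated data in the same format as df (illustrative only)
set.seed(42)
n <- 200
sim <- data.frame(
  OS    = round(rexp(n, rate = 1 / 800)) + 1,
  Event = rbinom(n, size = 1, prob = 0.7),
  Group = factor(sample(paste0("group", 1:4), n, replace = TRUE),
                 levels = paste0("group", 1:4))
)

fit <- coxph(Surv(time = OS, event = Event) ~ Group, data = sim)
s   <- summary(fit)

# One row per non-reference level: HR, 95% CI and Wald P value
res <- data.frame(
  HR      = s$conf.int[, "exp(coef)"],
  lower95 = s$conf.int[, "lower .95"],
  upper95 = s$conf.int[, "upper .95"],
  p       = s$coefficients[, "Pr(>|z|)"]
)
print(round(res, 3))

# The interval is exp(coef +/- z * se(coef)), with z = qnorm(0.975)
z      <- qnorm(0.975)
se     <- s$coefficients[, "se(coef)"]
manual <- exp(cbind(coef(fit) - z * se, coef(fit) + z * se))
all.equal(unname(manual), unname(exp(confint(fit))))
```

The same arithmetic applies to the output above: for Groupgroup2, exp(0.15929 ± 1.96 × 0.29957) gives roughly 0.652 and 2.109.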

----------------------

Finally, you can plot the Kaplan-Meier survival curves for these groups using a wrapper, km.coxph.plot, from the Bioconductor package survcomp:

library(survcomp)
km.coxph.plot(formula.s=Surv(time=OS, event = Event) ~ Group, data.s=df, mark.time=TRUE,
  x.label="Time (days)", y.label="Overall survival", main.title="",
  leg.text=c("Group1","Group2","Group3", "Group4"), leg.pos="topright", leg.bty="n", leg.inset=0,
  .col=c("limegreen","royalblue","purple","red1"),
  o.text="",
  .lty=c(1,1,1,1), .lwd=c(1.75,1.75,1.75,1.75), show.n.risk=TRUE, n.risk.step=500, n.risk.cex=0.8, verbose=FALSE)


mtext(side=3, line=-1, adj=-0.25, "Cox PH survival", cex=3)

mtext(side=3, line=-13, adj=0.95, "HR=2.95 (0.52, 16.62), p=0.2", cex=0.8, col="red")
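
Rather than typing the HR text by hand, the annotation can be assembled from the fitted model so that it matches the data being plotted. The sketch below is one possible way to do this, using simulated two-group data and base plotting via survfit from the survival package instead of km.coxph.plot. The object names (sim, fit, lab) and the choice to annotate the first non-reference level are assumptions for illustration.

```r
library(survival)

# Simulated two-group data (illustrative only)
set.seed(1)
n <- 200
sim <- data.frame(
  OS    = round(rexp(n, rate = 1 / 800)) + 1,
  Event = rbinom(n, size = 1, prob = 0.7),
  Group = factor(sample(c("group1", "group2"), n, replace = TRUE),
                 levels = c("group1", "group2"))
)

fit <- coxph(Surv(OS, Event) ~ Group, data = sim)
s   <- summary(fit)

# Build "HR=... (lower, upper), p=..." for the first non-reference level
lab <- sprintf("HR=%.2f (%.2f, %.2f), p=%.2g",
               s$conf.int[1, "exp(coef)"],
               s$conf.int[1, "lower .95"],
               s$conf.int[1, "upper .95"],
               s$coefficients[1, "Pr(>|z|)"])

# Kaplan-Meier curves with the annotation placed inside the plot
plot(survfit(Surv(OS, Event) ~ Group, data = sim),
     col = c("limegreen", "royalblue"), mark.time = TRUE,
     xlab = "Time (days)", ylab = "Overall survival")
legend("topright", legend = levels(sim$Group),
       col = c("limegreen", "royalblue"), lty = 1, bty = "n")
mtext(side = 3, line = -2, adj = 0.05, lab, cex = 0.8, col = "red")
```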