Question: cox proportional hazard model
2
gravatar for liu4gre
3 months ago by
liu4gre140
United States
liu4gre140 wrote:

Hi All,

I am wondering how to derive HR, CI and p values for each factor from cox model like follows. Using coxph only gives these values for groups such as BRCA status, TUmor stage ...

THanks.

enter image description here

survival cox • 275 views
ADD COMMENTlink modified 3 months ago by Kevin Blighe21k • written 3 months ago by liu4gre140
5
gravatar for Kevin Blighe
3 months ago by
Kevin Blighe21k
University College London Cancer Institute
Kevin Blighe21k wrote:

This would have been performed in the realm of survival analysis, looking at overall survival (OS) and progression-free survival (PFS), as you can probably see.

The starting point for the Cox Proportional Hazards Regression (Cox) is data in this format:

head(df)
    OS Censor  Group
1 1065      0 group1
2    0      0 group2
3  883      0 group1
4   33      1 group2
5  790      0 group1
6 2517      1 group2

The columns are

  • OS: overall survival (days, weeks, months, years - just needs to be consistent)
  • Censor: censoring, time2, or event (e.g. death)
  • Group: the categories of interest - can be anything such as ER status, IHC scores for CD20, race, or something else

Cox is run with coxph in R, and it needs to be performed on a survival object, e..g, produced by Surv

As per the table (above), there is a reference level for the category of interest, e.g., BRCA wild-type. Thus, we must also choose a reference category against which all other categories will be compared (here group1 is the reference):

df$Group <- factor(df$Group, levels=c("group1","group2","group3","group4"))
df$Group
  [1] group1 group2 group1 group2 group1 group2 group3 group2 group3 group2
 [11] group1 group4 group4 group3 group4 group4 group2 group3 group1 group3
 [21] group4 group4 group4 group1 group3 group3 group2 group1 group3 group4
 [31] group1 group1 group4 group2 group3 group3 group4 group3 group2 group4
 [41] group4 group3 group3 group4 group4 group4 group3 group2 group2 group1
 [51] group1 group4 group3 group1 group2 group2 group1 group4 group4 group4
 [61] group1 group1 group4 group1 group2 group2 group1 group3 group2 group3
 [71] group3 group2 group4 group3 group1 group3 group1 group3 group4 group2
 [81] group3 group2 group1 group3 group3 group2 group1 group3 group1 group1
 [91] group4 group3 group3 group3 group4 group2 group4 group4 group4 group4
[101] group2 group3 group4 group4 group3 group3
Levels: group1 group2 group3 group4

Now we can actually generate hazard ratios (including CIs) and P values:

coxmodel <- coxph(Surv(OS, Censor) ~ Group, data=df)
summary(coxmodel)
Call:
coxph(formula = Surv(OS, Censor) ~ Group, data = df)

  n= 106, number of events= 106 

                coef exp(coef) se(coef)      z Pr(>|z|)
Groupgroup2  0.15929   1.17267  0.29957  0.532    0.595
Groupgroup3  0.03724   1.03794  0.27747  0.134    0.893
Groupgroup4 -0.14772   0.86267  0.28570 -0.517    0.605

            exp(coef) exp(-coef) lower .95 upper .95
Groupgroup2    1.1727     0.8528    0.6519     2.109
Groupgroup3    1.0379     0.9634    0.6025     1.788
Groupgroup4    0.8627     1.1592    0.4928     1.510

Concordance= 0.515  (se = 0.032 )
Rsquare= 0.011   (max possible= 0.999 )
Likelihood ratio test= 1.19  on 3 df,   p=0.7566
Wald test            = 1.18  on 3 df,   p=0.7575
Score (logrank) test = 1.19  on 3 df,   p=0.7563

The P values for each category are given by Pr(>|z|). The HRs are given by exp(coef). and you can probably guess the CIs. Just to be sure, here are the HRs with 2.5% and 97.5% CIs:

exp(confint(coxmodel))
                  2.5 %   97.5 %
Groupgroup2 0.6519067 2.109444
Groupgroup3 0.6025433 1.787944
Groupgroup4 0.4927892 1.510188

----------------------

Finally, you can then actually plot the Kaplan-Meier survival curve for this using a wrapper, km.coxph.plot:

km.coxph.plot(formula.s=Surv(time=OS, Censor) ~ Group, data.s=df, mark.time=TRUE,
  x.label="Time (days)", y.label="Overall survival", main.title="",
  leg.text=c("Group1","Group2","Group3", "Group4"), leg.pos="topright", leg.bty="n", leg.inset=0,
  .col=c("limegreen","royalblue","purple","red1"),
  o.text="",
  .lty=c(1,1,1,1), .lwd=c(1.75,1.75,1.75,1.75), show.n.risk=TRUE, n.risk.step=500, n.risk.cex=0.8, verbose=FALSE)


mtext(side=3, line=-1, adj=-0.25, "Cox PH survival", cex=3)

mtext(side=3, line=-13, adj=0.95, "HR=2.95 (0.52, 16.62), p=0.2", cex=0.8, col="red")

surv

ADD COMMENTlink modified 3 months ago • written 3 months ago by Kevin Blighe21k

Thanks, Kevin. It looks cool. But it is for univariable test? How about multivariable, e.g. BRCA status, Tumor stage and Residual tumor?

ADD REPLYlink written 3 months ago by liu4gre140
1

If you want to further subdivide your cohort, then you could just create new categories, like, for example:

BRCA1mutation.StageI
BRCA1mutation.StageII
BRCA1mutation.StageIII
BRCA1mutation.StageIV
BRCA2mutation.StageI
BRCA2mutation.StageII
BRCA2mutation.StageIII
BRCA2mutation.StageIV

You can also adjust for other factors / covariates in the Cox model, or do interactions, which is an alternative hypothesis to the above but still useful:

coxph(Surv(OS, Censor) ~ Group + TumourStage, data=df)
coxph(Surv(OS, Censor) ~ Group:TumourStage, data=df)

I recall having a conversation in this regard last year amongst a group of statisticians (i.e., interaction terms in a Cox model - from what I recall, it's a somewhat unexplored area).

ADD REPLYlink modified 3 months ago • written 3 months ago by Kevin Blighe21k

Thanks. But it looks the reference for each category can not be defined? E.g. Stage I as reference in Tumor stage and BRCA wildtype in BRCA status?

Or in above table, the reference was defined individually, which means three testes were performed to generate the table of PFS or OS?

ADD REPLYlink written 3 months ago by liu4gre140
1

In that table that you posted, the reference levels are 'BRCA wild-type', 'Tumour Stage II', and 'Residual Tumour 0'. They did not have data for Stage I or they just did not consider it for the study.

They would have used 6 tests (3 for Os; 3 for PFS) to generate the results in that table.

ADD REPLYlink written 3 months ago by Kevin Blighe21k
1

These would have been the models:

coxph(Surv(OS, Censor) ~ BRCAstatus, data=df.OS)
coxph(Surv(OS, Censor) ~ TumourStage, data=df.OS)
coxph(Surv(OS, Censor) ~ ResidualTumour, data=df.Os)

coxph(Surv(PFS, Censor) ~ BRCAstatus, data=df.PFS)
coxph(Surv(PFS, Censor) ~ TumourStage, data=df.PFS)
coxph(Surv(PFS, Censor) ~ ResidualTumour, data=df.PFS)
ADD REPLYlink written 3 months ago by Kevin Blighe21k

Great, thanks.

Sorry for one more question. df.OS you have put into the models are: BRCA1mutation BRCA2mutation BRCA1methylation,

or:

BRCA1mutation.StageII BRCA1mutation.StageIII and IV BRCA2mutation.StageII BRCA2mutation.StageIII and IV

ADD REPLYlink written 3 months ago by liu4gre140
1

For the first OS model, the data would be:

head(df.OS)
OS    Censor  BRCAstatus
1065  0       BRCA1mutation
0     0       BRCA1mutation
883   0       BRCA1methylation
33    1       BRCAwildtype
...

levels(df.OS$BRCAstatus)
BRCAwildtype, BRCA1mutation, BRCA2mutation, BRCA1methylation

For PFS, it would be:

head(df.PFS)
    PFS  Censor  BRCAstatus
    501  0       BRCA1mutation
    0    0       BRCA1mutation
    38   0       BRCA1methylation
    10   1       BRCAwildtype
    ...

Then, there are 4 more different models:

  1. OS with Tumour Stage
  2. OS with Residual Tumour
  3. PFS with Tumour Stage
  4. PFS with Residual Tumour

Hope that this helps.

ADD REPLYlink written 3 months ago by Kevin Blighe21k
1

Thanks. Nice tutorial.

ADD REPLYlink written 3 months ago by liu4gre140
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1362 users visited in the last hour