The default p-value that is calculated by `survfit()`

is the log rank p-value from the score test, which is one of the most oft-quoted p-values for survival data.

If you want to obtain a p-value for each individual stratum compared to the base / reference stratum, then you can use the Cox proportional hazards model, which will produce the same log rank p-value as `Survfit()`

when ties are 'exact':

## load AML data

```
require(survival)
require(survminer)
aml
time status x
1 9 1 Maintained
2 13 1 Maintained
3 13 0 Maintained
4 18 1 Maintained
5 23 1 Maintained
6 28 0 Maintained
7 31 1 Maintained
8 34 1 Maintained
9 45 0 Maintained
10 48 1 Maintained
11 161 0 Maintained
12 5 1 Nonmaintained
13 5 1 Nonmaintained
14 8 1 Nonmaintained
15 8 1 Nonmaintained
16 12 1 Nonmaintained
17 16 0 Nonmaintained
18 23 1 Nonmaintained
19 27 1 Nonmaintained
20 30 1 Nonmaintained
21 33 1 Nonmaintained
22 43 1 Nonmaintained
23 45 1 Nonmaintained
```

## let's change the data to add a third stratum, '*SuperMaintained*'

```
aml$x <- as.character(aml$x)
aml[10,3] <- 'SuperMaintained'
aml[11,3] <- 'SuperMaintained'
aml[22,3] <- 'SuperMaintained'
aml[23,3] <- 'SuperMaintained'
aml$x <- factor(aml$x, levels = c('Nonmaintained','Maintained','SuperMaintained'))
aml
time status x
1 9 1 Maintained
2 13 1 Maintained
3 13 0 Maintained
4 18 1 Maintained
5 23 1 Maintained
6 28 0 Maintained
7 31 1 Maintained
8 34 1 Maintained
9 45 0 Maintained
10 48 1 SuperMaintained
11 161 0 SuperMaintained
12 5 1 Nonmaintained
13 5 1 Nonmaintained
14 8 1 Nonmaintained
15 8 1 Nonmaintained
16 12 1 Nonmaintained
17 16 0 Nonmaintained
18 23 1 Nonmaintained
19 27 1 Nonmaintained
20 30 1 Nonmaintained
21 33 1 Nonmaintained
22 43 1 SuperMaintained
23 45 1 SuperMaintained
```

With the `factor()`

command, I also set the reference level to '*Nonmaintained*'.

## use `survfit()`

```
fit <- survfit(
Surv(time, status) ~ x,
data = aml)
surv_pvalue(fit)
variable pval method pval.txt
x 0.005309417 Log-rank p = 0.0053
ggsurvplot(
fit,
conf.int = FALSE,
surv.median.line = c('hv'),
data = aml,
pval = TRUE,
pval.method = TRUE,
risk.table = FALSE)
```

So, the log rank p-value is 0.005309417; Survival is worse for the '*Nonmaintained*' stratum.

## use `coxph()`

```
coxfit <- coxph(
Surv(time, status) ~ x,
data = aml,
ties = 'exact')
summary(coxfit)
Call:
coxph(formula = Surv(time, status) ~ x, data = aml, ties = "exact")
n= 23, number of events= 18
coef exp(coef) se(coef) z Pr(>|z|)
xMaintained -1.12456 0.32479 0.58806 -1.912 0.05584 .
xSuperMaintained -2.60439 0.07395 0.92232 -2.824 0.00475 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
exp(coef) exp(-coef) lower .95 upper .95
xMaintained 0.32479 3.079 0.10258 1.0284
xSuperMaintained 0.07395 13.523 0.01213 0.4508
Concordance= 0.729 (se = 0.054 )
Likelihood ratio test= 10.87 on 2 df, p=0.004
Wald test = 8.58 on 2 df, p=0.01
Score (logrank) test = 10.48 on 2 df, p=0.005
```

So, the p-values for each stratum are:

- Maintained vs Nonmaintained, 0.05584
- SuperMaintained vs Nonmaintained, 0.00475

The hazard ratios (HRs) are given by:

- exp(coef), HR
- exp(-coef), HR flipped the other way
- lower .95, lower 95% CI of the HR
- upper .95, upper 95% CI of the HR

## check that the log rank p-values are the same in both models

The log rank p-value from the Cox model can be accessed by: `summary(coxfit)$sctest[3]`

```
round(summary(coxfit)$sctest[3], digits = 9) == round(surv_pvalue(fit)[,2], digits = 9)
pvalue
TRUE
```

Kevin

Amazing, thank you so much!!

You are welcome, Olive.

This just helped me too - thank you

The log rank p value of 0.0053 in the graph above take into consideration all the 3 groups(Maintained, Nonmaintained, SuperMaintained)? What does one p value for 3 groups tell? I know later on you showed pvalues between 2 groups with reference as Nonmaintained.