Question

How to Interpret p-value from multi-curve Kaplan-Meier Graph

0

4.6 years ago

andrewnrdoig ▴ 10

I've generated a few KM graphs from TCGA data. I am unsure what the p-value reported by the function and shown on the chart indicates. I looked through the survival.pdf from CRAN but am still unsure.

I currently believe that the p-value indicates with a 0.95 confidence that there is statistically significant variance of survival probabilities over time between the curves.

Here is a representative graph: [Example graph of multi-curve KM plot with p-value]

The data is being run through the following code to generate this graph:

# Load the packages that provide Surv()/survfit() and ggsurvplot()
library(survival)
library(survminer)
# Generate the survival object
survival_object <- Surv(time = cesc_survival_data$survival_time, event = cesc_survival_data$survival_status)
# Fit Kaplan-Meier curves stratified by quartile of the SLC2A2 expression z-score
quartiles_SLC2A2_expression_survival_curve <- survfit(survival_object ~ zscore_value_SLC2A2_quartile, data = cesc_survival_data)
# Plot the Kaplan-Meier curves for the four SLC2A2 expression quartiles
ggsurvplot(quartiles_SLC2A2_expression_survival_curve, data = cesc_survival_data, 
           #conf.int = TRUE,
           pval = TRUE,
           fun = "pct",
           risk.table = TRUE,
           size = 1,
           linetype = "strata",
           #palette = c("#E7B800", "#2E9FDF"),
           legend = "bottom",
           legend.title = "Expression Quartiles",
           legend.labs = c("Low Exp.", "Med. Low Exp.", "Med. High Exp.", "High Exp."),
           caption = "SLC2A2 Expression Quartile Survival Curve"
)

If it is helpful the complete code can be found at: https://github.com/andrewnrdoig/RIndyResearch

R survival analysis kaplan-meier graph • 5.5k views

ADD COMMENT • link updated 4.6 years ago by Kevin Blighe 89k • written 4.6 years ago by andrewnrdoig ▴ 10

score 2 · Accepted Answer · 2020-12-02

2

4.6 years ago

Kevin Blighe 89k

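The default p-value that is calculated is the log-rank p-value. It tests the null hypothesis that all of your strata (survival groups) share the same survival curve, i.e., that there is no difference in survival between any of the groups over the follow-up period. With four quartile groups this is a single global test: a small p-value suggests that at least one curve differs from the others, but it does not say which one.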
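As a rough illustration of that global test, here is a minimal sketch using survdiff() from the survival package. It does not use your TCGA data: it assumes the `lung` dataset that ships with survival, and the `age_quartile` grouping is made up purely to mimic four expression quartiles.

```r
library(survival)

# Illustrative data only: the `lung` dataset bundled with the survival package.
# Age is split into quartiles as a stand-in for four expression groups.
lung_data <- survival::lung
lung_data$age_quartile <- cut(
  lung_data$age,
  breaks = quantile(lung_data$age, probs = seq(0, 1, 0.25)),
  include.lowest = TRUE,
  labels = c("Q1", "Q2", "Q3", "Q4")
)

# Log-rank test: the null hypothesis is that all four groups share one survival curve
logrank <- survdiff(Surv(time, status) ~ age_quartile, data = lung_data)
print(logrank)

# A single chi-square statistic with (number of groups - 1) degrees of freedom
df <- length(logrank$n) - 1
p_value <- pchisq(logrank$chisq, df = df, lower.tail = FALSE)
p_value
```

Run on the same formula and data as your survfit() call, this should give the p-value that ggsurvplot() prints with pval = TRUE, as long as the default method is left unchanged.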

I show this in a previous answer, here: A: survfit(Surv()) P-value interpretation for 3 survival curves?

In my other answer, I also show how you can use the coxph() function to generate p-values for each pairwise stratum / curve.
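A minimal sketch of that coxph() approach, again on the bundled `lung` dataset with a made-up `age_quartile` grouping rather than your data. Note that these are Wald p-values from a Cox model, each comparing one group against the reference level, so they are not the same quantity as the global log-rank p-value. The p.adjust() step is my own addition for multiple testing, not something from the linked answer.

```r
library(survival)

# Illustrative data only: `lung` from the survival package, age split into quartiles
lung_data <- survival::lung
lung_data$age_quartile <- cut(
  lung_data$age,
  breaks = quantile(lung_data$age, probs = seq(0, 1, 0.25)),
  include.lowest = TRUE,
  labels = c("Q1", "Q2", "Q3", "Q4")
)

# Cox model with the grouping factor: each coefficient compares one
# quartile against the reference level (Q1 by default)
fit <- coxph(Surv(time, status) ~ age_quartile, data = lung_data)
coefs <- summary(fit)$coefficients
coefs[, c("exp(coef)", "Pr(>|z|)")]  # hazard ratios and Wald p-values vs Q1

# Optional: adjust the three comparisons for multiple testing
p.adjust(coefs[, "Pr(>|z|)"], method = "BH")

# To compare against a different group, change the reference level and refit
lung_data$age_quartile_q4ref <- relevel(lung_data$age_quartile, ref = "Q4")
fit_q4 <- coxph(Surv(time, status) ~ age_quartile_q4ref, data = lung_data)
summary(fit_q4)$coefficients[, c("exp(coef)", "Pr(>|z|)")]
```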

Looking at your data, the p-value is not statistically significant, and your data points are sparse after ~75 days.

Kevin

1

Thanks Kevin, I appreciate the links also!

ADD REPLY • link 4.6 years ago by andrewnrdoig ▴ 10