I've generated a few Kaplan-Meier (KM) plots from TCGA data. I am unsure what the p-value provided by the function and shown on the plot indicates. I have looked through the survival.pdf from CRAN but am still unsure.
I currently believe that the p-value indicates, at a 0.95 confidence level, whether there is a statistically significant difference in survival probabilities over time between the curves.
Here is a representative graph:
The data is being run through the following code to generate this graph:
#Load the packages that provide Surv/survfit and ggsurvplot
library(survival)
library(survminer)

#Generate survival object
survival_object <- Surv(time = cesc_survival_data$survival_time,
                        event = cesc_survival_data$survival_status)

#Fit the survival data to a curve that is defined by the quartiles of zscore SLC2A2
quartiles_SLC2A2_expression_survival_curve <- survfit(survival_object ~ zscore_value_SLC2A2_quartile,
                                                      data = cesc_survival_data)

#Generate a graph of survival object/expression quartile curve of zscore SLC2A2 median quartiles
ggsurvplot(quartiles_SLC2A2_expression_survival_curve,
           data = cesc_survival_data,
           #conf.int = TRUE,
           pval = TRUE,
           fun = "pct",
           risk.table = TRUE,
           size = 1,
           linetype = "strata",
           #palette = c("#E7B800", "#2E9FDF"),
           legend = "bottom",
           legend.title = "Expression Quartiles",
           legend.labs = c("Low Exp.", "Med. Low Exp.", "Med. High Exp.", "High Exp."),
           caption = "SLC2A2 Expression Quartile Survival Curve"
)
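My working assumption is that `pval = TRUE` reports a log-rank test across all four curves at once, with the null hypothesis that every group has the same survival function. If that is right, roughly the same number should come out of `survdiff()` from the survival package. Below is a minimal sketch of that check. Since my data is not attached here, it uses the built-in `lung` dataset as a stand-in, and the `age_quartile` grouping is a made-up substitute for my SLC2A2 expression quartiles.

```r
# Sketch only: the built-in `lung` data stands in for cesc_survival_data,
# and age quartiles stand in for the SLC2A2 expression quartiles.
library(survival)

lung_data <- survival::lung
lung_data$age_quartile <- cut(
  lung_data$age,
  breaks = quantile(lung_data$age, probs = seq(0, 1, 0.25)),
  include.lowest = TRUE,
  labels = c("Q1", "Q2", "Q3", "Q4")
)

# Log-rank test comparing all four groups in a single test
log_rank_fit <- survdiff(Surv(time, status) ~ age_quartile, data = lung_data)

# The test statistic is compared to a chi-square distribution with
# (number of groups - 1) degrees of freedom
degrees_of_freedom <- length(log_rank_fit$n) - 1
p_value <- pchisq(log_rank_fit$chisq, df = degrees_of_freedom, lower.tail = FALSE)

print(log_rank_fit)
print(p_value)
```

If this interpretation holds, a small p-value would only suggest that at least one quartile's curve differs from the others; it would not say which one, and it would not be a statement about a 0.95 confidence level.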
If it is helpful, the complete code can be found at: https://github.com/andrewnrdoig/RIndyResearch