Kaplan-Meier curve dropping to zero
8 months ago

Hi,

I am doing survival analysis like this:

    library(survival)   # Surv(), survfit()
    library(survminer)  # ggsurvplot()

    # Follow-up time in days: last follow-up date minus histology (diagnosis) date
    clinical_data$time_to_death <- as.Date(clinical_data$Data.último.follow.up, format = "%Y-%m-%d") -
      as.Date(clinical_data$Data.Histologia, format = "%Y-%m-%d")
    km_fit <- survfit(Surv(time_to_death, Morte.S.N) ~ Disease, data = clinical_data)
    ggsurvplot(km_fit, pval = TRUE, risk.table = TRUE, conf.int = FALSE,
               legend.title = "Survival Analysis")

Time to death - time from the diagnosis until the last follow-up of the patient
Morte.S.N - 0 if the patient is alive and 1 if the patient died
Disease - the type of disease

For the green line, the curve falls suddenly from 0.25 to 0.0, yet the last three patients do not die at the same time; their times to death are 1826, 1891 and 1925 days. Is this plot correct? What could possibly be wrong?

R SurvivalAnalysis SurvFit Surv

Show us the data for the green group. This may be due to an event happening after the censoring time for the green group.

Thank you! For the green group:

    > clinical_data$time_to_death
    Time differences in days
    [1]  357  287 1826  187  919 1178  787   22  211  252   76  751  344  148  804  361   75 1925 1325  796  385 1891  564  175
    [25]  330
    > clinical_data$Morte.S.N
    [1] 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 0 1 1 0 0 1 1

8 months ago
Shred ★ 1.4k

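I think this is expected. The last observation in the green group (1925 days) is an event (death), and no patient in that group is followed beyond it. The observations at 1826 and 1891 days are censored (Morte.S.N = 0), so only one patient is still at risk at 1925 days. Other groups have longer follow-up. Based on your data, the survival probability for the green group therefore falls to zero at that last time point, because no other observation in the group has a longer survival time.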
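To see where the drop comes from, here is a rough sketch that recomputes the Kaplan-Meier (product-limit) estimate by hand from the green-group values posted above. The variable names (`time`, `status`, `km`) are mine, chosen for illustration; `survfit` should give the same numbers.

```r
# Green-group data copied from the post above
time <- c(357, 287, 1826, 187, 919, 1178, 787, 22, 211, 252, 76, 751, 344,
          148, 804, 361, 75, 1925, 1325, 796, 385, 1891, 564, 175, 330)
status <- c(1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
            1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1)

# Product-limit estimate: at each distinct event time t, multiply the
# running survival by (1 - d_t / n_t), where d_t is the number of deaths
# at t and n_t is the number of patients still at risk just before t
event_times <- sort(unique(time[status == 1]))
n_at_risk <- sapply(event_times, function(t) sum(time >= t))
n_deaths  <- sapply(event_times, function(t) sum(time == t & status == 1))
surv <- cumprod(1 - n_deaths / n_at_risk)

km <- data.frame(time = event_times, n_at_risk, n_deaths, surv)

# Last rows of the table: at t = 1925 one patient is at risk and that
# patient dies, so the factor is (1 - 1/1) = 0 and the curve hits zero
print(tail(km, 3), digits = 3)
```

The step just before that is about 0.22, which matches the "0.25 to 0.0" drop seen in the plot.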

You could choose a right censoring time at 1925 days to apply to all the other groups. This makes the observation more robust, given that you're comparing the same time interval across all the groups.
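A minimal sketch of what that could look like, assuming you simply cap follow-up at a common cutoff. `censor_at` is a hypothetical helper (not part of any package), and the example values are made up, not taken from the OP's data.

```r
# Hypothetical helper: administratively censor all follow-up at `tau` days.
# Observations longer than tau are cut back to tau and marked as censored;
# an event exactly at tau is kept as an event.
censor_at <- function(time, status, tau) {
  data.frame(
    time   = pmin(time, tau),
    status = ifelse(time > tau, 0, status)
  )
}

# Made-up values for illustration only
example_time   <- c(500, 1925, 2400, 3100)
example_status <- c(1, 1, 1, 0)

capped <- censor_at(example_time, example_status, tau = 1925)
print(capped)
```

The capped `time` and `status` columns can then be passed to `Surv()` in place of the originals. Note that the green-group death at 1925 days stays an event, so that curve would still reach zero; the cutoff only shortens the follow-up of the other groups.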


You could choose a right censoring time at 1925 days to apply to all the other groups. This makes the observation more robust, given that you're comparing the same time interval across all the groups.

May I know if there is any literature supporting this statement? I would like to learn more about this. I know this might be textbook knowledge, but anything that gives details on this topic will be helpful. Thanks!


It's mostly a choice dictated by the study design and aim, not a need. The log rank test is used to test the null hypothesis that there is no difference between the populations in the probability of an event at any time point. Citing source:

The logrank test is most likely to detect a difference between groups when the risk of an event is consistently greater for one group than another

In the OP's case, the tail of the survival curves is shaped by a drop in events in all the groups (all other late observations are censored), with a single event in the green group at 1925 days. That event could be an outlier (again, this must be verified), which could exaggerate the evidence for rejecting the null hypothesis.