Question

Why is the median NA for some groups in a survival analysis?

5.0 years ago

Biologist ▴ 290

I have a data frame, data, with columns such as FollowUpDays and patient_vital_status, plus a high/low expression label for a gene.

I'm trying to do a survival analysis using the follow-up time, patient_vital_status and the gene's expression group. I'm running the log-rank test like this:

library(survival)

# Log-rank test comparing survival between the high and low expression groups
surv_diff <- survdiff(Surv(FollowUpDays, patient_vital_status) ~ ENSG00000001460,
                      data = data)
surv_diff

Call:
survdiff(formula = Surv(FollowUpDays, patient_vital_status) ~ 
    ENSG00000001460, data = data)

                       N Observed Expected (O-E)^2/E (O-E)^2/V
ENSG00000001460=high 332       57     70.5      2.58      5.99
ENSG00000001460=low  264       67     53.5      3.40      5.99

 Chisq= 6  on 1 degrees of freedom, p= 0.01

From the above, the log-rank test for a difference in survival gives p = 0.01, indicating that survival differs significantly between the high and low expression groups.

To see which group has the better or worse prognosis, I wanted to compare the median survival time of the two groups, so I ran the following:

library(survival)
# Kaplan-Meier estimate for each expression group
fit <- survfit(Surv(FollowUpDays, patient_vital_status) ~ ENSG00000001460,
               data = data)

print(fit)

Call: survfit(formula = Surv(FollowUpDays, patient_vital_status) ~ 
    ENSG00000001460, data = data)

                       n events median 0.95LCL 0.95UCL
ENSG00000001460=high 332     57     NA    2134      NA
ENSG00000001460=low  264     67   1741    1503      NA

From the above I see that the median of the high group is NA, and that 0.95UCL is NA for both groups.

If the median of one of the groups is NA, how can I tell which group has the worse prognosis? Can anyone explain what these NAs mean?

Any help is appreciated. Thank you.

RNA-Seq survival kaplan-meier clinical r • 4.2k views


score 1 · Answer 1 · 2019-04-11

5.0 years ago

Jean-Karim Heriche 27k

The median here is defined as the time at which the survival curve reaches 0.5 (see the documentation for print.survfit()). Try plot(fit) to see whether the median can be estimated. An NA suggests that, in your data, the survival curve for that group never drops to 0.5, i.e. the median survival time is not reached.
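The same check can be done numerically instead of by eye. The sketch below is only an illustration: the toy data frame, its column names (time, status, group) and the values in it are invented so that the example runs on its own, and they are not the poster's data. It pulls the per-group medians out of summary(fit)$table and then looks at the lowest value each Kaplan-Meier curve reaches.

```r
library(survival)

# Hypothetical toy data, made up for illustration only.
# "high": 10 patients, only 2 deaths, the rest censored.
# "low":   9 patients, all of whom die during follow-up.
toy <- data.frame(
  time   = c(1:10, 1:9),
  status = c(0, 1, 0, 0, 1, 0, 0, 0, 0, 0, rep(1, 9)),
  group  = factor(c(rep("high", 10), rep("low", 9)))
)

fit <- survfit(Surv(time, status) ~ group, data = toy)

# Median survival per group, as reported by print(fit)
medians <- summary(fit)$table[, "median"]
print(medians)   # NA for group=high, 5 for group=low

# Lowest survival probability reached by each curve.
# fit$strata holds the number of time points per group,
# so rep() labels every entry of fit$surv with its group.
group_of_point <- rep(names(fit$strata), fit$strata)
min_surv <- tapply(fit$surv, group_of_point, min)
print(min_surv)  # a value above 0.5 means the median is not reached
```

In this toy example the curve for the high group stays above 0.5 for the whole follow-up, which is exactly the situation in which the median is reported as NA.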


Thank you. I made the plot and can see a median for the low group, but none for the high group. The plot looks like this: [survival plot]. What I actually want to know is which group has the better or worse prognosis without plotting. If a group (high or low) has a median of NA, does that mean that group has the good prognosis?

Reply • 5.0 years ago by Biologist ▴ 290

You don't have enough data for the red curve to reach 0.5, so the median can't be estimated. All you can say in that case is that the median is greater than the last follow-up time you have for that group.
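When one median is NA, the groups can still be compared without a plot. The sketch below shows three possible ways and is not the only approach: the survival probability of each group at a fixed time point, the observed versus expected number of events from survdiff(), and the hazard ratio from a Cox model. The toy data, its column names and the choice of time 5 are assumptions made purely for illustration.

```r
library(survival)

# Hypothetical toy data, made up for illustration only
toy <- data.frame(
  time   = c(1:10, 1:9),
  status = c(0, 1, 0, 0, 1, 0, 0, 0, 0, 0, rep(1, 9)),
  group  = factor(c(rep("high", 10), rep("low", 9)))
)

# 1. Survival probability of each group at a fixed time
#    (time 5 is an arbitrary choice for this toy example)
fit <- survfit(Surv(time, status) ~ group, data = toy)
s5 <- summary(fit, times = 5)
surv_at_5 <- setNames(s5$surv, as.character(s5$strata))
print(surv_at_5)      # higher value = better survival at that time

# 2. Observed vs expected events from the log-rank test:
#    fewer events than expected points to the better-surviving group
sd <- survdiff(Surv(time, status) ~ group, data = toy)
obs_vs_exp <- rbind(observed = sd$obs, expected = sd$exp)
colnames(obs_vs_exp) <- names(sd$n)
print(obs_vs_exp)

# 3. Hazard ratio from a Cox model ("high" is the reference level,
#    so a ratio above 1 means a higher hazard in the "low" group)
cox <- coxph(Surv(time, status) ~ group, data = toy)
hazard_ratio <- exp(coef(cox))
print(hazard_ratio)
```

In the output posted in the question, the high group has fewer observed deaths than expected (57 vs 70.5) and the low group has more (67 vs 53.5), which points in the same direction as the missing median: the high group is the one surviving better.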

Reply • 5.0 years ago by Jean-Karim Heriche 27k