Question: Why do survival plots look different with the same data?
Biologist (150) wrote:

Hello,

The survival plot based on the best separation of high- and low-expression samples of GPAM, with an expression cutoff of 23.6 FPKM, looks like the one below (this plot is from the Human Protein Atlas database):

[Image: survival plot of high vs. low GPAM expression samples]

I took the GPAM FPKM data from the above database and merged it with the clinical data. Everything is stored in a data frame, df:

head(df)

  times bcr_patient_barcode patient.vital_status      FPKM
1   724        TCGA-2Y-A9GS                    1      30.3
2  1624        TCGA-2Y-A9GT                    1       5.6
3  1569        TCGA-2Y-A9GU                    0      26.6
4  2532        TCGA-2Y-A9GV                    1      18.4
5  1271        TCGA-2Y-A9GW                    1       4.7
6  2442        TCGA-2Y-A9GX                    0      19.4

I used the survminer package to determine the cutpoint that divides the samples into low and high expression groups.

library(survminer)

surv_rnaseq.cut <- surv_cutpoint(
  df,
  time = "times",
  event = "patient.vital_status",
  variables = c("FPKM")
)

 summary(surv_rnaseq.cut)
          cutpoint statistic
GPAM_FPKM     23.6  2.834408

Then the samples are categorized into high and low groups:

surv_rnaseq.cat <- surv_categorize(surv_rnaseq.cut)

To plot the data, I did the following:

library(survival)
library(survminer)  # provides ggsurvplot()
library(RTCGA)      # provides theme_RTCGA()

fit <- survfit(Surv(times, patient.vital_status) ~ FPKM,
               data = surv_rnaseq.cat)

pdf("Survival_high_vs_low.pdf", width = 10, height = 10)
p <- ggsurvplot(
  fit,                           # survfit object with calculated statistics
  risk.table = TRUE,             # show risk table
  pval = TRUE,                   # show p-value of log-rank test
  conf.int = TRUE,               # show confidence intervals for the point
                                 # estimates of the survival curves
  xlim = c(0, 3000),             # present a narrower x axis; this does not
                                 # affect the survival estimates
  break.x.by = 1000,             # break x axis into time intervals of 1000
  break.y.by = 0.1,              # break y axis into intervals of 0.1
  ggtheme = theme_RTCGA(),       # customize plot and risk table with a theme
  risk.table.y.text.col = TRUE,  # colour risk table text annotations
  risk.table.y.text = FALSE      # show bars instead of names in text
                                 # annotations in the legend of the risk table
)
print(p)  # print explicitly so the plot is written even when the script is sourced
dev.off()

The survival plot I got looks like this: [Image: survival plot from my analysis]. I used the same data that the Human Protein Atlas database used, but the plot from my analysis looks different from the plot in the database.

What could be the reason for this? The Kaplan-Meier statistics?

Any help is appreciated.

modified 8 months ago by Friederike (3.6k) • written 8 months ago by Biologist (150)
Friederike (3.6k) wrote:

I have nothing of substance to contribute, except that the actual details of the analysis matter, since the Human Protein Atlas people themselves show that the same data can very well yield different-looking survival plots: https://www.proteinatlas.org/ENSG00000119927-GPAM/pathology/tissue/liver+cancer

Overall, the trend seems to be the same for your analysis and theirs, no? Do you know whether you used the same tools, settings and cut-offs as the HPA guys?

written 8 months ago by Friederike (3.6k)

Yes, the trend looks the same, but in my plot the high-expression curve drops sharply after 2000 days, which I didn't observe in the HPA plot. I used the same cutoff (23.6) that they used. I don't know what causes that small difference.
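One possible contributor to a late drop like this, sketched under stated assumptions: at each event time the Kaplan-Meier estimate multiplies the running survival by (1 - events / number at risk). When only a few samples remain at risk late in follow-up, a single event (or a single missing sample) moves the curve visibly. The base-R sketch below computes the estimate by hand on the six rows shown in head(df). It assumes patient.vital_status == 1 marks an event (death), and km_estimate is a hypothetical helper name, not a function from survival or survminer.

```r
# The six rows printed by head(df) in the question
df_head <- data.frame(
  times = c(724, 1624, 1569, 2532, 1271, 2442),
  patient.vital_status = c(1, 1, 0, 1, 1, 0)  # assumed: 1 = event, 0 = censored
)

# Hypothetical helper: product-limit (Kaplan-Meier) estimate at each event time
km_estimate <- function(time, event) {
  event_times <- sort(unique(time[event == 1]))
  # samples still under observation just before each event time
  n_risk <- sapply(event_times, function(t) sum(time >= t))
  # events occurring exactly at each event time
  n_event <- sapply(event_times, function(t) sum(time == t & event == 1))
  # running product of the conditional survival probabilities
  survival <- cumprod(1 - n_event / n_risk)
  data.frame(time = event_times, n_risk = n_risk,
             n_event = n_event, survival = survival)
}

km <- km_estimate(df_head$times, df_head$patient.vital_status)
print(km)
```

In this toy subset only one sample is still at risk at day 2532, so a single event takes the estimate from 4/9 to 0. The real curves have far more samples, but the same mechanism makes the tail of a curve the part most sensitive to one sample more or less.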

written 8 months ago by Biologist (150)

You have one sample fewer (247 instead of 248 for one group). Also: did you remove everything with FPKM < 1?
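A minimal sketch of how the group sizes could be checked against a fixed cutoff, using only the six rows shown in head(df). It assumes "high" means FPKM strictly above the cutoff; how values exactly at 23.6 are handled in the HPA analysis is not stated in this thread, so that boundary rule is worth checking separately.

```r
# The six rows printed by head(df) in the question
df_head <- data.frame(
  bcr_patient_barcode = c("TCGA-2Y-A9GS", "TCGA-2Y-A9GT", "TCGA-2Y-A9GU",
                          "TCGA-2Y-A9GV", "TCGA-2Y-A9GW", "TCGA-2Y-A9GX"),
  FPKM = c(30.3, 5.6, 26.6, 18.4, 4.7, 19.4)
)

cutoff <- 23.6  # cutoff reported in the question

# Assumed rule: strictly above the cutoff is "high", everything else is "low"
df_head$group <- ifelse(df_head$FPKM > cutoff, "high", "low")

# Number of samples per group, to compare with the risk table at time 0
group_sizes <- table(df_head$group)
print(group_sizes)

# Samples sitting exactly on the cutoff would be sensitive to the > vs >= choice
n_on_cutoff <- sum(df_head$FPKM == cutoff)
print(n_on_cutoff)
```

Running the same check on the full df and comparing the counts with the numbers at risk at time 0 in both plots would show which group is missing a sample.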

written 8 months ago by Friederike (3.6k)

Yes, I see that I have one sample fewer. I guess it won't make much difference. In their analysis they removed genes with FPKM < 1; in my case I'm looking at only a single gene.

modified 8 months ago • written 8 months ago by Biologist (150)