Inconsistent survival times in TCGA clinicalMatrix file?
15 months ago

While trying to analyze survival of glioblastoma patients, I found the following data in the downloaded GBM_clinicalMatrix file:

sampleID           CDE_survival_time    CDE_vital_status    days_to_last_followup
TCGA.06.5859.01    138                  LIVING              139
TCGA.27.1831.01    504                  DECEASED            505


I wondered whether my survival analysis should use 139 instead of 138, and 504 instead of 505. I don't understand how a deceased patient can have a last follow-up one day after death, or why, for a living patient, the last follow-up doesn't "update" the survival time. Am I misunderstanding something, or is there a mistake (or two) in the downloaded data?

survival analysis TCGA data
15 months ago

From where did you obtain this data? I just looked at the data on the GDC Legacy Data Portal and there is no discrepancy:

bcr_patient_barcode vital_status    last_contact_days_to    death_days_to
bcr_patient_barcode vital_status    days_to_last_followup   days_to_death
CDE_ID:2673794      CDE_ID:5        CDE_ID:3008273          CDE_ID:3165475
TCGA-06-5859        Alive           139                     [Not Applicable]


From what I understand from using the TCGA data since ~2014, they do not calculate an actual 'survival time' in the main data, which leads me to believe that you are using some third-party re-processed data that seems to be erroneous, but I could be wrong, of course.
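To make that concrete, here is a minimal sketch of how an overall-survival time and event flag could be derived from the raw columns shown above. The function name `overall_survival` and the accepted status spellings are my own assumptions for illustration, not part of any TCGA or GDC tooling. The usual convention is to take days_to_death for deceased patients and to censor living patients at days_to_last_followup.

```python
def overall_survival(vital_status, days_to_last_followup, days_to_death):
    """Return (time_in_days, event) for one patient.

    event is 1 for an observed death and 0 for a censored patient
    (still alive at last contact). Hypothetical helper, for illustration.
    """
    status = vital_status.strip().lower()
    if status in ("dead", "deceased"):
        # Death observed: survival time is the time to death.
        return int(days_to_death), 1
    if status in ("alive", "living"):
        # No death observed: censor at the last follow-up.
        return int(days_to_last_followup), 0
    raise ValueError(f"unexpected vital status: {vital_status!r}")


# Row quoted above from the GDC Legacy Data Portal.
print(overall_survival("Alive", "139", "[Not Applicable]"))  # (139, 0)

# Hypothetical deceased patient; these values are made up for illustration.
print(overall_survival("Dead", "[Not Available]", "400"))    # (400, 1)
```

The resulting (time, event) pairs are what a Kaplan-Meier or Cox analysis would normally take as input, so a precomputed survival-time column is not needed.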

Kevin


First of all, thank you very much for the effort you put into answering this question. I downloaded the data from the Xena browser.

The problem was that I chose the wrong column:

 CDE_survival_time


instead of the correct one:

days_to_death


Thanks again.


Sure thing
