TCGA patient death dates
11 months ago
loughrae ▴ 90

Hi everyone,

I've been looking at the TCGA clinical data, queried using TCGAbiolinks:

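library(TCGAbiolinks)
# List all GDC projects and keep only the TCGA ones
allproj <- getGDCprojects()
projs <- allproj$id[startsWith(allproj$id, "TCGA")]
# Download the clinical data for each TCGA project
clin <- lapply(projs, GDCquery_clinic, type = "clinical")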

I was surprised to see some patients with death dates before TCGA data collection started in ~2006. There are 7,510 patients reported Alive and 3,641 reported Dead. Of the 3,641 dead, 2,705 have a year_of_death listed, and 31% of those have a year_of_death earlier than 2005, some as early as 1990.
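For reference, here is a minimal sketch of how counts like these can be tabulated. It assumes the clinical tables carry vital_status and year_of_death columns (the latter is the field named above). The toy data frame clin_df is a hypothetical stand-in for the real download, and its values are made up purely for illustration. Combining the per-project data frames in clin into one table is left out, since their columns may differ between projects.

```r
# Toy stand-in for the combined clinical table (made-up values, not TCGA data)
clin_df <- data.frame(
  vital_status  = c("Alive", "Dead", "Dead", "Dead", "Alive", "Dead"),
  year_of_death = c(NA, 1998, 2010, NA, NA, 2003)
)

# Number of patients per vital status
status_counts <- table(clin_df$vital_status)

# Dead patients only; %in% avoids NA rows if vital_status is missing
dead <- clin_df[clin_df$vital_status %in% "Dead", ]

# Dead patients that actually have a year_of_death recorded
with_year <- dead[!is.na(dead$year_of_death), ]
n_with_year <- nrow(with_year)

# Fraction of those whose year_of_death is earlier than 2005
frac_early <- mean(with_year$year_of_death < 2005)

print(status_counts)
print(n_with_year)
print(frac_early)
```

The 2005 cutoff is the same one used in the percentage above; changing it to 2006 would be a one-character edit.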

Does anyone know what's going on here? Are these dates correct, and if so, were samples not taken at diagnosis?

Also, the last recorded death date is 2014. Were patients followed at all after TCGA finished in 2014, or are some of the patients listed as Alive actually dead now? What timeframe does the vital status refer to?

