Question: Getting subtypes of cancer from TCGA
4.6 years ago by
University of Maryland
noorpratap.singh300 wrote:

I am working with RNA-Seq data for renal cancer, which I downloaded from the GDC portal. I want to know the subtype associated with each sample. I scanned through the clinical metadata, the biospecimen metadata, and even some individual XML files, but I did not find any header corresponding to subtype. is down, otherwise perhaps I could have gotten the subtypes from there. I also tried the R package TCGAbiolinks to get the subtypes, but the total number of samples (cases) it retrieves is far lower than what I downloaded manually from the GDC portal. Is there anywhere I can retrieve the subtype of each sample? Thanks in advance.

In addition, has the TCGA data website been permanently shut down? The GDC data portal seems more extensive, with more samples. Has it replaced the TCGA site?

rna-seq gdc tcga • 2.6k views
4.6 years ago by
aditi.qamra270 wrote:


You can use the cBioPortal R package to extract the clinical data. The clinical data will also contain the subtype information. You can follow the instructions listed on the webpage.


Thanks, but as I mentioned, the above link also gives fewer samples than the GDC portal. cBioPortal contains data from the old TCGA data portal and perhaps has not been updated.

4.6 years ago by
Mike1.7k wrote:

You can also try RTCGAToolbox and TCGA-Assembler, but I'm not sure whether they are up to date.

Also keep in mind that the clinical data contains only cancer/tumor sample information, whereas the expression data contains both types of samples (normal and cancer). So the total number of samples in the clinical data is lower. Count both types of samples and find where the numbers differ and by how much.
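
To make that count concrete, here is a minimal Python sketch. It assumes the sample IDs are standard TCGA barcodes, where the first two digits of the fourth field give the sample type (01-09 tumor, 10-19 normal, 20-29 control) and the first three fields identify the patient. The barcodes and the `clinical_patients` set are made up for illustration, and the helper names are hypothetical, not part of any package.

```python
from collections import Counter


def sample_class(barcode: str) -> str:
    """Classify a TCGA sample barcode by its sample-type code (4th field)."""
    fields = barcode.split("-")
    if len(fields) < 4 or not fields[3][:2].isdigit():
        return "unknown"
    code = int(fields[3][:2])
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    if 20 <= code <= 29:
        return "control"
    return "unknown"


def patient_id(barcode: str) -> str:
    """Return the patient-level part of a barcode (first three fields)."""
    return "-".join(barcode.split("-")[:3])


# Made-up barcodes standing in for the column names of an expression matrix.
expression_barcodes = [
    "TCGA-AA-0001-01A-11R-0000-07",  # tumor
    "TCGA-AA-0001-11A-11R-0000-07",  # matched normal, same patient
    "TCGA-AA-0002-01A-11R-0000-07",  # tumor, patient absent from clinical data
]
# Made-up patient IDs standing in for the rows of a clinical table.
clinical_patients = {"TCGA-AA-0001"}

# How many samples of each type are in the expression data?
counts = Counter(sample_class(b) for b in expression_barcodes)

# Which patients have expression data but no clinical record?
expression_patients = {patient_id(b) for b in expression_barcodes}
missing = sorted(expression_patients - clinical_patients)

print(counts)
print(missing)
```

Comparing at the patient level rather than the sample level avoids counting a tumor/normal pair from one patient as two separate cases.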


Thanks @Mike, but the toolbox is out of service. As for the clinical and expression data, I am aware that the expression profiles contain both normal and tumor data from the same patient. Surprisingly, I looked at the Broad Institute data, and it contains exactly the same number of samples as the GDC portal. But how can I find the subtype information there?
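
Whichever source ends up providing subtype labels, attaching them to the samples is the same step. The sketch below assumes the labels come as a table keyed by patient barcode (the first three barcode fields); that layout is an assumption, not something confirmed in this thread. The barcodes and the labels `subtype_A` and `subtype_B` are placeholders, not real annotations.

```python
def patient_id(barcode: str) -> str:
    """Return the patient-level part of a TCGA barcode (first three fields)."""
    return "-".join(barcode.split("-")[:3])


def annotate_subtypes(sample_barcodes, subtype_by_patient):
    """Map each sample barcode to its patient's subtype, or None if absent."""
    return {
        barcode: subtype_by_patient.get(patient_id(barcode))
        for barcode in sample_barcodes
    }


# Placeholder subtype table keyed by patient barcode (hypothetical labels).
subtype_by_patient = {
    "TCGA-AA-0001": "subtype_A",
    "TCGA-AA-0003": "subtype_B",
}
# Placeholder sample barcodes from an expression matrix.
samples = [
    "TCGA-AA-0001-01A-11R-0000-07",
    "TCGA-AA-0002-01A-11R-0000-07",
]

annotated = annotate_subtypes(samples, subtype_by_patient)
unlabelled = [s for s, subtype in annotated.items() if subtype is None]

print(annotated)
print(unlabelled)
```

Samples left in `unlabelled` are the ones to look into, since they show how much of the downloaded cohort the subtype source actually covers.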
