Question: How can I select WT and mutated samples?
Learner 90 wrote:

I am trying to download Kidney Renal Papillary Cell Carcinoma data from GDC. How can I identify which samples are wild-type (WT) and which are mutated? And how can I get only those samples?

rna-seq genome
modified 3 months ago by Kevin Blighe • written 3 months ago by Learner
University College London Cancer Institute
Kevin Blighe wrote:

There are different ways of determining this:

TCGA barcode

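If you have the TCGA barcode, this is by far the easiest way. Look at the 'Sample' field, which is the fourth hyphen-separated field of the barcode: codes 01-09 denote tumour samples, 10-19 normal samples, and 20-29 control samples.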
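The barcode check can be sketched in a few lines of Python. This is a minimal illustration, not an official tool: the function name `classify_tcga_sample` is made up here, and the barcodes are only examples in the standard TCGA layout.

```python
def classify_tcga_sample(barcode: str) -> str:
    """Classify a TCGA barcode as 'tumour', 'normal' or 'control'.

    Uses the two-digit code at the start of the fourth field
    (e.g. '01C' in TCGA-02-0001-01C-01D-0182-01).
    """
    fields = barcode.split("-")
    if len(fields) < 4 or not fields[3][:2].isdigit() or len(fields[3]) < 2:
        raise ValueError(f"no Sample field in barcode: {barcode!r}")

    code = int(fields[3][:2])
    if 1 <= code <= 9:
        return "tumour"
    if 10 <= code <= 19:
        return "normal"
    if 20 <= code <= 29:
        return "control"
    raise ValueError(f"unexpected sample code {code:02d} in {barcode!r}")


if __name__ == "__main__":
    for example in ("TCGA-02-0001-01C-01D-0182-01", "TCGA-02-0001-11A-01D-0182-01"):
        print(example, classify_tcga_sample(example))
```

Note that this separates tumour from normal tissue only; it says nothing about whether a given gene is mutated in a tumour sample.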



Other IDs

If you don't happen to have the TCGA barcode, then what you have is most likely a UUID, a Case ID, or just a filename that may have some ID in its name. In these situations, you can search for the ID manually in the search box at the GDC Data Portal and then follow the links to see whether it is tumour or normal.

For example, if I have:

  • UUID 0b0e0b62-b823-4fdb-b37b-4a2731e648a7, this relates to primary tumour
  • Filename 3ba5d6ec-dcce-49bb-82e5-85d3903a2aa1.htseq.counts.gz, this relates to UUID c247b168-3b4b-40ae-8e1a-32dda1b34397 and is a normal sample


There are other, automated ways of doing this, but the ones that I tried appeared to be outdated when I recently used them (open to being corrected if wrong, though).
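As one possible automated route, the GDC REST API can be asked for the sample type behind a filename. The sketch below is untested against the live service: the endpoint, the `filters`/`fields` parameters and the response layout follow my reading of the GDC API documentation and should be verified, and the helper names (`build_query`, `sample_types`, `fetch_sample_types`) are invented for this example. Older filenames may also no longer be present in the current data release.

```python
import json
import urllib.parse
import urllib.request

# Assumed public endpoint of the GDC API for file metadata.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_query(file_names):
    """Build query parameters asking for the samples behind some filenames."""
    names = list(file_names)
    filters = {"op": "in", "content": {"field": "file_name", "value": names}}
    return {
        "filters": json.dumps(filters),
        "fields": "file_name,cases.samples.submitter_id,cases.samples.sample_type",
        "format": "JSON",
        "size": str(len(names)),
    }


def sample_types(response):
    """Map each filename in a decoded API response to its sample types."""
    result = {}
    for hit in response["data"]["hits"]:
        types = [
            sample["sample_type"]
            for case in hit.get("cases", [])
            for sample in case.get("samples", [])
        ]
        result[hit["file_name"]] = types
    return result


def fetch_sample_types(file_names):
    """Query the GDC API (needs network access) and return sample types."""
    url = GDC_FILES_ENDPOINT + "?" + urllib.parse.urlencode(build_query(file_names))
    with urllib.request.urlopen(url) as handle:
        return sample_types(json.load(handle))
```

Calling `fetch_sample_types(["3ba5d6ec-dcce-49bb-82e5-85d3903a2aa1.htseq.counts.gz"])` would then return a dictionary keyed by filename, provided the file is still indexed.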


written 3 months ago by Kevin Blighe

@Kevin Blighe I would like to know: if I get the exon data for the normal and tumor samples, what are the possibilities for checking for differences? For example, do you know a way to check for mutations, or to check the effect of specific genes across the two conditions? Would you use gene expression or FIRMA? And how do you deal with it?

written 3 months ago by Learner

I would download the raw count htseq files and then re-analyse them. Are you familiar with RNA-seq methodologies? All that you would need is DESeq2 for normalisation and differential expression.
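Before DESeq2 can be run, the per-sample htseq-count files have to be combined into one gene-by-sample count matrix. A minimal sketch of that step in Python, using only the standard library, is below. The function names and the sample labels are hypothetical; the only format assumptions are that each file has two tab-separated columns (gene ID, count) and that htseq-count summary rows start with `__`.

```python
import csv
import gzip


def read_htseq_counts(path):
    """Read one htseq-count file (plain or .gz) into {gene_id: count}."""
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = {}
    with opener(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) != 2 or row[0].startswith("__"):
                continue  # skip blank lines and summary rows such as __no_feature
            counts[row[0]] = int(row[1])
    return counts


def build_count_matrix(paths_by_sample):
    """Combine several count files into a header plus gene-by-sample rows.

    Genes missing from a sample's file are filled with 0.
    """
    per_sample = {s: read_htseq_counts(p) for s, p in paths_by_sample.items()}
    samples = list(per_sample)
    genes = sorted(set().union(*per_sample.values()))
    rows = [[g] + [per_sample[s].get(g, 0) for s in samples] for g in genes]
    return ["gene_id"] + samples, rows
```

The resulting table can be written out as a tab-separated file and loaded into R as the count matrix for DESeq2, with a matching sample table giving the condition of each column.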

For mutations, only MAF files are available in the TCGA open access data.
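Given a MAF file, splitting samples into mutated and WT for one gene can be sketched as follows. This is an illustration under assumptions: it relies on the standard MAF columns `Hugo_Symbol` and `Tumor_Sample_Barcode`, the function name `split_by_mutation` is invented here, and you have to supply the full list of sequenced samples yourself, because samples with no mutation in the gene do not appear in the MAF for that gene.

```python
import csv


def split_by_mutation(maf_handle, gene, all_samples):
    """Return (mutated, wild_type) sample sets for one gene.

    maf_handle  : open text handle (or any iterable of lines) of a MAF file
    gene        : gene symbol as written in the Hugo_Symbol column
    all_samples : every tumour sample barcode that was sequenced
    """
    # MAF files may start with comment lines such as '#version 2.4'.
    lines = (line for line in maf_handle if not line.startswith("#"))
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)

    mutated = {
        row["Tumor_Sample_Barcode"]
        for row in reader
        if row["Hugo_Symbol"] == gene
    }
    all_samples = set(all_samples)
    return mutated & all_samples, all_samples - mutated
```

A sample ending up in the WT set only means that no mutation was called for that gene in the MAF; it is not proof of a wild-type genotype, and copy-number deletions are not recorded in a MAF at all. Filtering on the `Variant_Classification` column (for example, dropping `Silent` calls) is a common refinement.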

You can also check cBioPortal, which may already have all information for your gene of interest.

written 3 months ago by Kevin Blighe

@Kevin Blighe Why would you download the raw files? The problem is that the files are controlled, so I cannot download them. How can I use cBioPortal? Does it have all the data that exist in TCGA?

In general, I am trying to do the following:

  • Look at the wild-type samples for a specific gene to see whether there is a change between the wild-type and the mutated/deleted samples (this can be done with DESeq2)
  • Find out whether I can derive any biological processes linked to a specific gene (where they are unregulated in wild-type samples)

I appreciate your help. Thanks.

written 3 months ago by Learner