How can I select WT and mutated samples?
4.7 years ago
Learner ▴ 250

I am trying to download the Kidney Renal Papillary Cell Carcinoma data from GDC. How can I identify which samples are WT and which are mutated, and how can I get only those samples?

genome RNA-Seq • 1.6k views
4.7 years ago

There are different ways of determining this:

# TCGA barcode

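If you have the TCGA barcode, this is by far the easiest way. Look at the 'Sample' field, which is the fourth hyphen-separated part of the barcode (for example, 01A). Its two digits are 01 to 09 for tumour samples, 10 to 19 for normal samples, and 20 to 29 for control samples.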
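As a minimal sketch of reading the 'Sample' field programmatically: the function below assumes the standard TCGA barcode layout, where the fourth hyphen-separated field starts with a two-digit sample type code (01 to 09 tumour, 10 to 19 normal, 20 to 29 control). The function name and the example barcodes are only illustrative.

```python
def classify_tcga_barcode(barcode: str) -> str:
    """Return 'tumour', 'normal' or 'control' based on the Sample field."""
    fields = barcode.split("-")
    if len(fields) < 4:
        raise ValueError(f"barcode has no Sample field: {barcode!r}")
    # The Sample field looks like '01A': a two-digit type code plus a vial letter.
    code = int(fields[3][:2])
    if 1 <= code <= 9:
        return "tumour"
    if 10 <= code <= 19:
        return "normal"
    if 20 <= code <= 29:
        return "control"
    raise ValueError(f"unexpected sample type code {code} in {barcode!r}")


# Illustrative barcodes only.
print(classify_tcga_barcode("TCGA-02-0001-01C-01D-0182-01"))  # tumour
print(classify_tcga_barcode("TCGA-02-0001-11A-01R-0182-07"))  # normal
```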

# Other IDs

If you don't happen to have the TCGA barcode, then you most likely have a UUID, a Case ID, or just a file name that may have some ID in its name. In these situations, you can search for the ID manually in the search box at the GDC Data Portal and then follow the links to see whether it is tumour or normal.

For example, if I have:

• UUID 0b0e0b62-b823-4fdb-b37b-4a2731e648a7, this relates to primary tumour
• Filename 3ba5d6ec-dcce-49bb-82e5-85d3903a2aa1.htseq.counts.gz, this relates to UUID c247b168-3b4b-40ae-8e1a-32dda1b34397 and is a normal sample
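The same lookup can probably be scripted against the public GDC API rather than done by hand. The sketch below assumes the ID is a file UUID and that the `files` endpoint returns the `cases.samples.sample_type` field for it; the function names are hypothetical, and the network call only happens when `lookup_sample_types` is called.

```python
import json
import urllib.parse
import urllib.request

GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_file_url(file_uuid):
    """Build a GDC API URL asking for the samples linked to one file UUID."""
    fields = ",".join([
        "file_name",
        "cases.submitter_id",
        "cases.samples.submitter_id",
        "cases.samples.sample_type",
    ])
    query = urllib.parse.urlencode({"fields": fields})
    return f"{GDC_FILES_ENDPOINT}/{file_uuid}?{query}"


def sample_types(response):
    """Collect (sample barcode, sample type) pairs from a parsed API response."""
    pairs = []
    for case in response.get("data", {}).get("cases", []):
        for sample in case.get("samples", []):
            pairs.append((sample.get("submitter_id"), sample.get("sample_type")))
    return pairs


def lookup_sample_types(file_uuid):
    """Query the GDC API (needs network access) and return the sample types."""
    with urllib.request.urlopen(build_file_url(file_uuid), timeout=30) as handle:
        return sample_types(json.load(handle))
```

For the second example above, that would be `lookup_sample_types("c247b168-3b4b-40ae-8e1a-32dda1b34397")`. A case UUID would need the `cases` endpoint instead, which is not shown here.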

## ------------------------

There are other, automated ways of doing this, but the ones that I tried appeared to be outdated when I recently used them (open to being corrected if wrong, though).

Kevin


@Kevin Blighe If I get the exon data for the normal and tumor samples, what are the possibilities for checking for differences? For example, do you know a way to check for mutations, or to check the effect of specific genes across the two conditions? Would you use gene expression or FIRMA? And how would you deal with it?


I would download the raw count htseq files and then re-analyse them. Are you familiar with RNA-seq methodologies? All that you would need is DESeq2 for normalisation and differential expression.
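Before DESeq2 can be run, the per-sample count files have to be combined into one gene-by-sample matrix. Here is a rough Python sketch of that step, assuming each HTSeq file has two tab-separated columns (gene ID and count) followed by summary rows whose names start with `__`, such as `__no_feature`. The function names are hypothetical.

```python
import csv


def merge_htseq_counts(samples):
    """Merge HTSeq count files into {gene_id: {sample_name: count}}.

    `samples` maps a sample name to an iterable of text lines from its file.
    """
    matrix = {}
    for sample, lines in samples.items():
        for row in csv.reader(lines, delimiter="\t"):
            if len(row) != 2 or row[0].startswith("__"):
                continue  # skip blank lines and HTSeq summary rows
            matrix.setdefault(row[0], {})[sample] = int(row[1])
    return matrix


def matrix_rows(matrix, sample_names):
    """Return a header row plus one row per gene, with genes sorted by ID."""
    rows = [["gene_id", *sample_names]]
    for gene in sorted(matrix):
        # A gene missing from a sample's file is written as a count of 0.
        rows.append([gene, *(matrix[gene].get(name, 0) for name in sample_names)])
    return rows
```

Each file can be opened with `gzip.open(path, "rt")` and passed in as the iterable of lines. The rows can then be written out with `csv.writer` and read into R as the count matrix.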

For mutations, only MAF files are available in the TCGA open access data.
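A MAF file can be used to split tumour samples into mutated and wild type for one gene. The sketch below assumes the standard MAF columns `Hugo_Symbol`, `Variant_Classification` and `Tumor_Sample_Barcode`. It treats a sample as wild type if the MAF reports no non-silent variant in that gene. Barcodes are trimmed to 15 characters so that DNA and RNA aliquots from the same sample match. The function names are hypothetical.

```python
import csv


def sample_key(barcode):
    """Trim a TCGA barcode to project-TSS-participant-sample type (15 chars)."""
    return barcode[:15]


def mutated_samples(maf_lines, gene, ignore=("Silent",)):
    """Return sample keys with at least one reported variant in `gene`."""
    body = (line for line in maf_lines if not line.startswith("#"))
    reader = csv.DictReader(body, delimiter="\t")
    return {
        sample_key(row["Tumor_Sample_Barcode"])
        for row in reader
        if row["Hugo_Symbol"] == gene
        and row["Variant_Classification"] not in ignore
    }


def split_by_status(tumour_barcodes, maf_lines, gene):
    """Split tumour samples into (wild_type, mutated) sets for `gene`."""
    tumours = {sample_key(barcode) for barcode in tumour_barcodes}
    mutated = mutated_samples(maf_lines, gene) & tumours
    return tumours - mutated, mutated
```

Note that this only covers variants reported in the MAF. Deletions seen in copy-number data are not captured, so 'wild type' here only means 'no reported mutation'.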

You can also check cBioPortal, which may already have all information for your gene of interest.


@Kevin Blighe Why would you download the raw files? The problem is that the files are controlled, so I cannot download them. How can I use cBioPortal? Does it have all the data that exist in TCGA? In general, I am trying to do the following. I want to look at the wild-type samples for a specific gene to see if there is a change between the wild-type and the mutated/deleted samples (this can be done with DESeq2). I also want to know if I can derive any biological processes linked to a specific gene (where they are upregulated in the wild-type samples).

I appreciate your help. Thanks.