Question

The Cancer Genome Atlas (TCGA) gene expression analysis

0

6.0 years ago

lovely.molbio ▴ 10

Hi, I am trying to analyze TCGA breast cancer patient data. Based on the expression of a gene (high vs. low) and survival, I have divided all breast cancer patients into two groups and sorted the patient/sample IDs accordingly. Now I want to compare genome-wide expression between the two groups (high-expression samples vs. low-expression samples) and identify differentially expressed genes (DEGs). Do I need to download the data from TCGA to perform this analysis? If so, how can I do this? Thanks a lot in advance.

TCGA Gene expression • 2.4k views

0

...but you indicate that you have already obtained the data, or am I incorrect? What exactly do you currently have, and what more do you need?

ADD REPLY • link 6.0 years ago by Kevin Blighe 89k

0

Thank you for your response. No, I have not downloaded any data from TCGA yet. I have used Broad Firehose and some other tools to classify the patients. Now I would like to perform a detailed gene expression analysis between the two groups. Normalized data are available on the Broad Firehose portal but, as far as I understand, these data are not compatible with R-based DEG analysis. So my question is: how should I proceed from here?

ADD REPLY • link 6.0 years ago by lovely.molbio ▴ 10

1

I have not used the data from Broad Firehose; however, UCSC's Xena Browser has expression data available as HTSeq counts and as FPKM: https://xenabrowser.net/datapages/ (look for the 'GDC TCGA' datasets)

You will want the HTSeq counts; however, you will have to convert them back to raw integer counts, as I show here: A: Normalisation of RNAseq data from UCSC Xena Browser
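For illustration, here is a minimal sketch of that back-conversion in plain Python. It assumes the Xena values are stored as log2(count + 1), which is how the GDC hub describes its HTSeq count files as far as I know; please check the unit listed on the dataset page before relying on it. The function name and the toy values are made up for this example and are not taken from the linked answer.

```python
def log2p1_to_counts(matrix):
    """Invert an assumed log2(count + 1) transform, gene by gene.

    `matrix` maps a gene ID to a list of transformed values (one per sample).
    Values are rounded to the nearest integer and clipped at zero, because
    count-based tools expect non-negative integers.
    """
    return {
        gene: [max(0, round(2 ** value - 1)) for value in row]
        for gene, row in matrix.items()
    }


# Toy values only, not real TCGA data
toy = {"GENE_A": [0.0, 1.0, 3.0], "GENE_B": [10.0, 0.0, 1.0]}
print(log2p1_to_counts(toy))
```

Rounding is there only to absorb floating-point noise; if the rounded values end up far from the unrounded ones, the input was probably not on the log2(count + 1) scale to begin with.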

After that, the counts can be used as input to DESeq2, edgeR, or the limma-voom pipeline.
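All three tools also need a sample table that says which column of the count matrix belongs to which group. Below is a rough sketch of how the high/low patient lists could be matched to the matrix columns. It assumes the groups are keyed by the 12-character TCGA patient barcode and that the matrix columns are sample barcodes such as `TCGA-XX-0001-01A`, where the two digits after the third hyphen give the sample type (`01` is primary tumour). The function name and the IDs are hypothetical.

```python
def build_sample_table(columns, high_patients, low_patients):
    """Return (sample_id, group) pairs for primary-tumour columns.

    Columns whose patient is in neither group, or whose sample type is
    not "01" (primary tumour), are skipped.
    """
    high, low = set(high_patients), set(low_patients)
    table = []
    for sample_id in columns:
        patient = sample_id[:12]        # patient barcode, e.g. TCGA-XX-0001
        sample_type = sample_id[13:15]  # "01" = primary tumour
        if sample_type != "01":
            continue
        if patient in high:
            table.append((sample_id, "high"))
        elif patient in low:
            table.append((sample_id, "low"))
    return table


# Made-up barcodes, for illustration only
columns = ["TCGA-XX-0001-01A", "TCGA-XX-0001-11A",
           "TCGA-XX-0002-01A", "TCGA-XX-0003-01A"]
sample_table = build_sample_table(columns, ["TCGA-XX-0001"], ["TCGA-XX-0002"])
print(sample_table)
```

Written out as a CSV, a table like this could serve as the `colData` for `DESeqDataSetFromMatrix` in DESeq2 with a design of `~ group`, provided its rows are in the same order as the columns of the count matrix.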