Need some help on how to use DESeq2 for TCGA data
6 weeks ago
Leo • 0

Hello, I am sorry for this newbie question, but I spent all morning searching and could not find a clear answer anywhere.

I want to normalise RNA-seq data from TCGA using DESeq2. I used the TCGA-Assembler R package to download RNA-seq data with the assay platform "gene_RNAseq", which gives an Excel file with raw_count and scaled_estimate for each patient sample and gene. Should I use the "ProcessRNASeqData" function from the TCGA-Assembler package, or just go with the Excel file produced by the "DownloadRNASeqData" function?

Then, before I can start using DESeq2, I have to create a count matrix. How do I do this? Does anyone have code ready that I could use for this type of data?

I would really appreciate any help, because I have no idea how to continue. Thanks!

Edit: I read somewhere that I should download "illuminahiseq_rnaseqv2-RSEM_genes_normalized (MD5)" from Firehose. How can I convert this data so I can use it with DESeq2?

DESeq2 TCGA R TCGA-Assembler • 384 views

In this repository, I provide R code to download bladder cancer data from TCGA (TCGA-BLCA) using the TCGAbiolinks package and then analyze it with DESeq2. If you want to give it a try, just replace "TCGA-BLCA" with the TCGA abbreviation for your cancer of interest.
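For anyone who wants a starting point before opening the repository, here is a rough sketch of what the download step with TCGAbiolinks can look like. This is not the code from the repository. The function name `fetch_tcga_counts` is made up for illustration, and the `"STAR - Counts"` workflow and the `"unstranded"` assay name are assumptions about what the GDC currently serves, so they may need adjusting. It also assumes TCGAbiolinks and SummarizedExperiment are installed from Bioconductor.

```r
# Sketch only: defines a helper but does not call it, since calling it
# downloads a large amount of data from the GDC.
# Assumes TCGAbiolinks and SummarizedExperiment are installed.
fetch_tcga_counts <- function(project = "TCGA-BLCA") {
  # Describe which files we want from the GDC.
  query <- TCGAbiolinks::GDCquery(
    project       = project,
    data.category = "Transcriptome Profiling",
    data.type     = "Gene Expression Quantification",
    workflow.type = "STAR - Counts"   # assumption: current GDC workflow name
  )

  # Download the files (by default into a GDCdata folder in the working dir).
  TCGAbiolinks::GDCdownload(query)

  # Assemble the files into a SummarizedExperiment object.
  se <- TCGAbiolinks::GDCprepare(query)

  # Assumption: the "unstranded" assay holds the raw integer counts.
  SummarizedExperiment::assay(se, "unstranded")
}
```

Calling `fetch_tcga_counts("TCGA-PRAD")`, for example, would be expected to return a genes-by-samples matrix of raw counts that can be handed to DESeq2 together with a sample annotation table.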

6 weeks ago
Barry Digby ▴ 920

Here is a .Rmd file for downloading miRNA and mRNA expression data from TCGA-PRAD with TCGAbiolinks and running the downstream DESeq2 analysis, which should do what you need. Take the parts you need and substitute your cancer of interest at the TCGAbiolinks step.

A word of caution: you will be missing ~10% of the ~60,000 genes (the same happens with Firehose).
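On the count matrix part of the question, here is a minimal sketch of one way to go from a per-gene table with raw_count and scaled_estimate columns to something DESeq2 accepts. The toy table, its column naming (`S1_raw_count` and so on) and the sample labels are hypothetical; the real TCGA-Assembler output will be laid out differently, so the `grep` pattern needs adapting. The key points are that DESeq2 expects un-normalised integer counts, so the raw_count values are used (rounded, since RSEM expected counts can be fractional) rather than scaled_estimate or the "normalized" Firehose files.

```r
# Toy stand-in for a downloaded table; layout and sample IDs are hypothetical.
tab <- data.frame(
  gene               = c("TP53", "EGFR", "MYC"),
  S1_raw_count       = c(1520.00, 310.47, 88.00),
  S1_scaled_estimate = c(2.1e-5, 4.0e-6, 1.2e-6),
  S2_raw_count       = c(990.52, 402.00, 120.90),
  S2_scaled_estimate = c(1.4e-5, 5.5e-6, 1.7e-6),
  stringsAsFactors   = FALSE
)

# Keep only the raw_count columns; scaled_estimate is a relative abundance
# and is not suitable as DESeq2 input.
count_cols <- grep("_raw_count$", names(tab), value = TRUE)
counts <- as.matrix(tab[, count_cols])
rownames(counts) <- tab$gene
colnames(counts) <- sub("_raw_count$", "", count_cols)

# RSEM expected counts can be fractional; DESeq2 needs integers.
counts <- round(counts)
storage.mode(counts) <- "integer"

# Sample annotation: one row per column of `counts`, in the same order.
# The tumor/normal labels here are invented for the toy data.
coldata <- data.frame(
  condition = factor(c("normal", "tumor")),
  row.names = colnames(counts)
)

# Build the DESeq2 object only if DESeq2 is installed. With real data
# (many genes, replicated samples) one would then run DESeq2::DESeq(dds).
if (requireNamespace("DESeq2", quietly = TRUE)) {
  dds <- DESeq2::DESeqDataSetFromMatrix(
    countData = counts,
    colData   = coldata,
    design    = ~ condition
  )
}
```

The toy data is far too small to run the actual differential expression step; it only illustrates the shape of the inputs: genes in rows, samples in columns, and a `colData` table whose row names match the column names of the count matrix.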

