TCGA: which files to download for analyzing differentially expressed miRNAs
3.6 years ago
ginny • 0

I have just started working on TCGA data, and I noticed that the RNA-seq (HT-Seq counts) files also contain ENSEMBL gene IDs for miRNAs. Does that mean the expression values of miRNA genes are also present in the RNA-seq files?

So then why does TCGA have a separate miRNA quantification dataset (files ending with .mirbase.mirna.quantification)?

I am confused because I plan to find both the differentially expressed genes and the differentially expressed miRNAs, and I don't know which dataset to use with DESeq2.

Please help! :(

RNA-Seq sequencing mirna tcga DESEQ2

You need to download them separately.

The mirbase.mirna.quantification files are what you want for miRNA DE analysis. If your HT-Seq counts files contain roughly 50,000 rows (harmonized data), you will also want to subset them to only the coding genes (~20,000) for the gene-level analysis.
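
As a rough sketch of the miRNA side: DESeq2 expects one raw-count matrix (features x samples), so the per-sample quantification files have to be combined first. The Python below assumes each file is tab-separated with a header containing `miRNA_ID` and `read_count` columns, which is how the GDC miRNA quantification files are usually laid out; check your own downloads. The function names, sample names and toy values are made up for illustration.

```python
import csv
import io


def read_mirna_counts(handle):
    """Return {miRNA_ID: raw read_count} from one quantification file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["miRNA_ID"]: int(row["read_count"]) for row in reader}


def build_count_matrix(samples):
    """Combine per-sample files into one matrix.

    samples: {sample_name: open text handle} (hypothetical input shape).
    Returns (mirna_ids, sample_names, rows), where rows[i][j] is the raw
    count of mirna_ids[i] in sample_names[j]; missing miRNAs become 0.
    """
    per_sample = {name: read_mirna_counts(h) for name, h in samples.items()}
    sample_names = sorted(per_sample)
    mirna_ids = sorted(set().union(*(c.keys() for c in per_sample.values())))
    rows = [[per_sample[s].get(m, 0) for s in sample_names] for m in mirna_ids]
    return mirna_ids, sample_names, rows


# Toy stand-ins for two downloaded files (values are invented).
HEADER = "miRNA_ID\tread_count\treads_per_million_miRNA_mapped\tcross-mapped\n"
sample_a = io.StringIO(
    HEADER + "hsa-let-7a-1\t120\t1500.0\tN\nhsa-mir-21\t900\t11250.0\tN\n"
)
sample_b = io.StringIO(
    HEADER + "hsa-let-7a-1\t80\t1000.0\tN\nhsa-mir-21\t1500\t18750.0\tN\n"
)

mirna_ids, sample_names, rows = build_count_matrix(
    {"sample_A": sample_a, "sample_B": sample_b}
)
```

Use the `read_count` column rather than the reads-per-million column, since DESeq2 is designed to work on raw counts.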


Thank you so much! Any idea how I can filter out only the coding genes?


I have code here (https://github.com/BarryDigby/TCGA_Biolinks/blob/master/TCGA_Biolinks.Rmd) that does everything you want: downloading the data, preparing the metadata, filtering for coding genes, and running the differential expression analysis. It's a good basic template to start with.

It was conducted on TCGA PRAD. Install packages as required, change PRAD to your tissue type of interest and you are good to go.
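
For anyone who only wants the idea behind the coding-gene filter, here is a minimal Python sketch; it is not the code from the linked notebook. It assumes an HT-Seq counts file with two tab-separated columns (versioned Ensembl gene ID, count) plus the `__`-prefixed summary rows that htseq-count appends, and it assumes you already have a gene-to-biotype lookup from an annotation source such as a GENCODE GTF. The `biotype_by_gene` dictionary below is a toy annotation, and `ENSG99999999999` is a made-up ID.

```python
def strip_version(gene_id):
    """Drop the version suffix: 'ENSG00000000003.13' -> 'ENSG00000000003'."""
    return gene_id.split(".", 1)[0]


def filter_protein_coding(count_lines, biotype_by_gene):
    """Keep only genes annotated as protein_coding.

    count_lines: iterable of 'gene_id<TAB>count' strings.
    biotype_by_gene: {unversioned Ensembl gene ID: biotype} (assumed input).
    """
    kept = {}
    for line in count_lines:
        line = line.strip()
        # Skip blanks and htseq-count summary rows such as __no_feature.
        if not line or line.startswith("__"):
            continue
        gene_id, count = line.split("\t")
        gene = strip_version(gene_id)
        if biotype_by_gene.get(gene) == "protein_coding":
            kept[gene] = int(count)
    return kept


# Toy annotation and toy counts, for illustration only.
biotype_by_gene = {
    "ENSG00000000003": "protein_coding",
    "ENSG00000000005": "protein_coding",
    "ENSG99999999999": "miRNA",  # made-up ID standing in for a miRNA gene
}
count_lines = [
    "ENSG00000000003.13\t2500",
    "ENSG00000000005.5\t12",
    "ENSG99999999999.1\t3",
    "__no_feature\t100000",
]

coding_counts = filter_protein_coding(count_lines, biotype_by_gene)
```

Stripping the version suffix first matters because annotation tables are often keyed by the unversioned ID.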
