Question: TCGA data(RPKM) differential gene expression
2.9 years ago by
realnewbie10 wrote:

Hi everyone, I am a real beginner in bioinformatics. I want to calculate a differential expression profile from my TCGA mRNA-seq data (5 normal and 5 tumor samples). However, what I have are RPKM values, which require non-parametric methods. I will then upload the results to the IPA (Ingenuity Pathway Analysis) tool to predict pathways, targets (for miRNA-seq data from TCGA), and upstream/downstream regulators. When I watched the IPA tutorial, I realized that to predict all of these from RNA-seq data I need a log fold change value, a p-value, and a false discovery rate. Unfortunately, I do not have much background in these calculations and do not know how to derive them from my RPKM-valued TCGA data. Can anyone help me? Thanks a lot!

rpkm rna-seq ipa tcga R • 2.9k views
modified 2.9 years ago • written 2.9 years ago by realnewbie10

Have you looked at cBioPortal:

If you are only interested in looking up data, this would be a painless way to do that.

written 2.9 years ago by genomax78k

IPA is commercial software, so you could contact their support about the required input format.

Anyway, a solution for DE analysis with TCGA data is described here: How to work with Level 3 data (RPKM values) from TCGA database

In short, there is no accepted method to get DE genes from RPKM, but it is possible to use the raw counts instead.
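
To make the "use the raw data" point concrete, here is a minimal Python sketch of how a log2 fold change can be derived from raw counts after scaling each sample by its library size (counts per million). The sample names, gene names, counts, and the pseudocount of 1 are all made up for illustration. This is not what DESeq2 or edgeR do internally (they fit count models and estimate dispersion); it only shows what the fold-change number means.

```python
import math

# Made-up raw counts for illustration only: sample -> {gene: count}.
raw_counts = {
    "normal_1": {"GENE_A": 100, "GENE_B": 900},
    "normal_2": {"GENE_A": 120, "GENE_B": 880},
    "tumor_1": {"GENE_A": 400, "GENE_B": 600},
    "tumor_2": {"GENE_A": 380, "GENE_B": 620},
}


def counts_per_million(counts_by_sample):
    """Scale each sample's raw counts by its library size (total count)."""
    cpm = {}
    for sample, gene_counts in counts_by_sample.items():
        library_size = sum(gene_counts.values())
        cpm[sample] = {
            gene: count * 1e6 / library_size
            for gene, count in gene_counts.items()
        }
    return cpm


def log2_fold_change(cpm, gene, tumor_samples, normal_samples, pseudocount=1.0):
    """log2 of (mean tumor CPM / mean normal CPM) for one gene.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes that are absent in one group.
    """
    tumor_mean = sum(cpm[s][gene] for s in tumor_samples) / len(tumor_samples)
    normal_mean = sum(cpm[s][gene] for s in normal_samples) / len(normal_samples)
    return math.log2((tumor_mean + pseudocount) / (normal_mean + pseudocount))


cpm = counts_per_million(raw_counts)
lfc = log2_fold_change(
    cpm, "GENE_A", ["tumor_1", "tumor_2"], ["normal_1", "normal_2"]
)
print(f"GENE_A log2 fold change (tumor vs normal): {lfc:.3f}")
```

A positive value means higher expression in tumor than in normal. The p-value and FDR still need a proper statistical test on the counts, which is what the dedicated packages provide.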

modified 2.9 years ago • written 2.9 years ago by Michael Dondrup47k

Thanks a lot for the explanation! Actually, I've watched all the videos; the format requires datasets with differential expression values, I mean values like log fold change, p-value, FDR and so on. My problem is how to calculate all of these. I am a newbie, and I don't think I have a strong background in coding. As far as I have read, there are some recommended packages: DESeq2, edgeR, NOISeq and so on. However, I got lost in the information provided by the users of these packages. I have a few normal and tumor samples (raw counts, median-length-normalized values and RPKM values for the mRNA-seq data from TCGA, and raw counts and reads-per-million-miRNA-mapped values for the miRNA-seq data). What I want is to calculate differential expression for these using R packages, but I could not write the proper code. Many thanks!
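
For readers wondering what the FDR column is, the following Python sketch shows the Benjamini-Hochberg adjustment, the same procedure that R's `p.adjust` offers under `method = "BH"`. The p-values are made up for illustration; in practice they would come from a per-gene test such as the ones the packages named above run on raw counts.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up per-gene p-values, for illustration only.
example_p_values = [0.01, 0.04, 0.03, 0.005]
example_fdr = benjamini_hochberg(example_p_values)
for p, q in zip(example_p_values, example_fdr):
    print(f"p = {p:.3f}  ->  adjusted p (FDR) = {q:.3f}")
```

Genes whose adjusted value falls below a chosen cutoff (0.05 is a common choice) are the ones usually reported as differentially expressed.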

modified 2.2 years ago • written 2.9 years ago by realnewbie10

I found this site very helpful; if you are interested, here it is:

written 2.9 years ago by realnewbie10

But now I have another problem: IPA cannot recognize the GeneCards IDs of the FireBrowse dataset. On the TCGA page, you cannot reach the normal samples, so you cannot compare normal versus cancer patients, for example against primary tumor. And, unfortunately, there is no page to convert GeneCards IDs to Ensembl or another known database's IDs. TCGA is a real challenge!

written 2.9 years ago by realnewbie10