RNAseq datasets, TPM/FPKM/DE analysis questions
15 months ago

Hi Folks,

I'm looking for a) strategic advice and b) public RNAseq datasets to look at differential gene expression between cancer types and controls.

It seems like most GEO or cBioPortal datasets don't provide raw counts, which is what I'd need for edgeR. My general "battle plan" was to analyze Dataset A and get a gene list, then do the same for Dataset B, compare the two lists, focus on the overlapping genes (assuming there is overlap), and then investigate the additional genes unique to each analysis.
I did get hold of raw counts for the TCGA datasets and was able to get a set of DE genes out of those (Dataset A), but I'm now stuck on Dataset B.
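
To make the list comparison concrete, here is a minimal sketch of what I have in mind. The gene symbols are made-up placeholders, not results from TCGA or any other dataset, and `compare_gene_lists` is just a helper name I picked for illustration.

```python
# Placeholder DE gene lists; the real ones would come from each dataset's analysis.
de_genes_a = ["TP53", "MYC", "EGFR", "KRAS"]   # e.g. Dataset A (hypothetical content)
de_genes_b = ["MYC", "EGFR", "BRCA1"]          # e.g. Dataset B (hypothetical content)

def compare_gene_lists(list_a, list_b):
    """Split two DE gene lists into shared and dataset-specific genes."""
    a, b = set(list_a), set(list_b)
    return {
        "shared": sorted(a & b),   # genes called DE in both datasets
        "only_a": sorted(a - b),   # genes found only in the first dataset
        "only_b": sorted(b - a),   # genes found only in the second dataset
    }

result = compare_gene_lists(de_genes_a, de_genes_b)
for group, genes in result.items():
    print(group, genes)
```

The "shared" group would be the first thing to focus on; the other two groups are the additional genes to investigate per dataset.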

For a), assuming that I can only get TPM/FPKM or z-score data, does it make sense to take my DE gene list and just confirm that those genes have high TPM/FPKM or z-score values in other cancer datasets but not in controls? Or, if a dataset has both cancer and control samples, would it be feasible to calculate the average logFC per gene and use that value to compare gene lists?
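
As a rough sketch of the second idea, this is how I would compute an average log2 fold change per gene from TPM values. The TPM numbers, gene names, pseudocount of 1, and the cutoff of 1.0 are all assumptions for illustration. Since there are no raw counts, this is only a descriptive consistency check and not a statistical test like the one edgeR runs.

```python
import math
from statistics import mean

PSEUDOCOUNT = 1.0  # assumption: avoids log2(0) for unexpressed genes

def mean_log2fc(tumor_tpm, control_tpm, pseudocount=PSEUDOCOUNT):
    """Mean log2(TPM + pseudocount) in tumors minus the same mean in controls."""
    tumor = mean(math.log2(x + pseudocount) for x in tumor_tpm)
    control = mean(math.log2(x + pseudocount) for x in control_tpm)
    return tumor - control

# Made-up TPM values for a dataset with cancer and control samples.
tpm = {
    "GENE1": {"tumor": [31.0, 63.0], "control": [3.0, 7.0]},
    "GENE2": {"tumor": [7.0, 7.0], "control": [7.0, 7.0]},
}

# Per-gene average log2 fold change (tumor vs. control).
lfc = {gene: mean_log2fc(v["tumor"], v["control"]) for gene, v in tpm.items()}

# Hypothetical DE list from the first dataset, checked against the second one.
de_genes_a = ["GENE1", "GENE2", "GENE3"]
MIN_LFC = 1.0  # assumption: arbitrary cutoff for "also up in this dataset"
confirmed = [g for g in de_genes_a if g in lfc and lfc[g] >= MIN_LFC]

print(lfc)
print(confirmed)
```

Genes missing from the second dataset (like the placeholder `GENE3`) are simply skipped rather than counted as unconfirmed.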

b) Is there any database that provides raw counts for RNAseq cancer studies?

Thanks for any pointers! Alex

RNA-Seq • 855 views

Have you seen BioJupies (https://amp.pharm.mssm.edu/biojupies/)? It builds on kallisto quantification to generate a DE analysis from GEO entries in minutes.


That looks promising! I'll look into it, thank you!

