Question

Can We Use RNA-seq FPKM Values Derived From 18 Cell Lines And Run GSEA Analysis On Them?

10.9 years ago

Hmm ▴ 500

I have 2 questions:

1) I have FPKM values for each cell line. How do I further normalize them so that I can compare across cell lines? Should I compute a z-score or robust z-score for each cell line?

2) Has anyone run the Broad Institute's GSEA software on RNA-seq data, and if so, are FPKM values the input to GSEA?

rnaseq fpkm • 5.3k views

updated 10.9 years ago by Damian Kao 16k • written 10.9 years ago by Hmm ▴ 500

score 6 · Answer 1 · 2013-06-26

To answer your first question, FPKM is already normalized in the sense that you've already divided by the total library size (and also by transcript length). Whether that's a good way of normalizing your reads is questionable. If you are comparing the same gene among cell lines, you really don't need to divide by transcript length, as it is a technical bias that should be consistent across all your samples.
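To make the two divisions concrete, here is a minimal sketch of the FPKM calculation for one sample. The function name `fpkm` and the toy counts and lengths are made up for illustration. The second sample is sequenced twice as deeply as the first, and the library-size division removes that difference:

```python
def fpkm(counts, lengths_bp):
    """Return FPKM values for one sample.

    counts: fragment counts per gene (toy numbers here).
    lengths_bp: transcript length per gene, in bases.
    """
    total_millions = sum(counts) / 1e6  # library size, in millions of fragments
    return [
        c / total_millions / (length / 1e3)  # per million fragments, per kilobase
        for c, length in zip(counts, lengths_bp)
    ]

lengths = [1000, 2000, 500]
sample_a = fpkm([100, 300, 600], lengths)
sample_b = fpkm([200, 600, 1200], lengths)  # same composition, twice the depth
```

Because each gene is divided by the same length in every sample, the length term cancels whenever you take the ratio of one gene between two samples.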

Other popular options are to normalize your read counts with DESeq's method or edgeR's TMM method. Here is a good paper that describes several normalization methods: http://bib.oxfordjournals.org/content/early/2012/09/15/bib.bbs046.full
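As a rough illustration of the idea behind DESeq's method (median-of-ratios size factors), here is a plain-Python sketch. It is not the DESeq package's own code; the function name `size_factors` and the toy count matrix are assumptions for the example. In practice you would run the R packages themselves on raw counts, not on FPKM.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts per sample.
    """
    n_samples = len(counts[0])
    # Use only genes with no zero counts, so the geometric mean is defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    log_ratios = [[] for _ in range(n_samples)]
    for row in usable:
        # Log of the gene's geometric mean across samples (the reference).
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # One factor per sample: the median ratio to the reference.
    return [math.exp(median(r)) for r in log_ratios]

# Toy data: sample 2 is sequenced twice as deeply; the last gene has a zero.
counts = [[10, 20], [100, 200], [50, 100], [0, 5]]
factors = size_factors(counts)
normalized = [[c / f for c, f in zip(row, factors)] for row in counts]
```

Taking the median makes the factor robust to a handful of highly expressed or strongly changing genes, which is the main weakness of dividing by total library size.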

Converting your normalized expression values to z-scores can be useful if you want to generate a nice heatmap or perform cluster analysis. However, you will lose information on the magnitude of gene expression. A gene going from 100 reads to 500 reads will have the same z-score as a gene going from 1000 reads to 5000 reads. Another option is to use the variance-stabilizing transformation from the DESeq package.
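A small sketch of that loss of magnitude, using made-up read counts for two genes across three samples, where the second gene is exactly ten times the first:

```python
from statistics import mean, stdev

def zscores(values):
    """Z-score one gene's expression values across samples."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation
    return [(v - m) / s for v in values]

low = zscores([100, 500, 300])      # lowly expressed gene
high = zscores([1000, 5000, 3000])  # same pattern at 10x the level
# Both lists come out as [-1.0, 1.0, 0.0]
```

The two genes become indistinguishable after scaling, which is fine for clustering by pattern but discards the expression level.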