Question: Can We Use RNA-seq FPKM Values Derived From 18 Cell Lines And Run GSEA Analysis On Them?
Hmm wrote (5.7 years ago):

I have 2 questions:

1) I have FPKM values for each cell line. How do I further normalize the results so that I can compare across cell lines? Should I compute a z-score or robust z-score for each cell line?

2) Has anyone run the GSEA software from the Broad Institute on RNA-seq data, and if so, are the FPKM values the input to GSEA?

rnaseq fpkm • 3.6k views
Damian Kao (USA) wrote (5.7 years ago):

To answer your first question: FPKM is already normalized, in the sense that the read counts have already been divided by the total library size (and also by transcript length). Whether that is a good way of normalizing your reads is questionable. If you are comparing the same gene among cell lines, you really don't need to divide by transcript length, as it is a technical bias that should be consistent across all your samples.
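To make the point about transcript length concrete, here is a minimal sketch in Python with NumPy. It assumes a genes-by-samples matrix of raw counts and a vector of gene lengths in base pairs; the function names `fpkm` and `cpm` are illustrative and not taken from any particular package. It shows that the between-sample ratio for a given gene is the same whether or not you divide by length.

```python
import numpy as np

def fpkm(counts, lengths_bp):
    """Fragments per kilobase per million mapped fragments.

    counts: genes x samples matrix of raw counts (assumed input layout).
    lengths_bp: per-gene length in base pairs.
    """
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float)[:, None] / 1e3
    lib_millions = counts.sum(axis=0, keepdims=True) / 1e6
    return counts / lengths_kb / lib_millions

def cpm(counts):
    """Counts per million: library-size scaling only, no length term."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=0, keepdims=True) * 1e6

# Toy data: 2 genes, 2 samples (made-up numbers for illustration).
counts = np.array([[100, 200],
                   [900, 800]])
lengths_bp = np.array([1000, 2000])

f = fpkm(counts, lengths_bp)
c = cpm(counts)

# For each gene, the sample2/sample1 ratio is identical under FPKM and CPM,
# because the gene length cancels out of a within-gene comparison.
print(f[:, 1] / f[:, 0])
print(c[:, 1] / c[:, 0])
```

The length term only matters when comparing different genes within one sample.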

Other popular options would be to normalize your reads with DESeq's method or edgeR's TMM method. Here is a good paper that describes several normalization methods: http://bib.oxfordjournals.org/content/early/2012/09/15/bib.bbs046.full
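For intuition, the idea behind DESeq's normalization (median of ratios) can be sketched in a few lines of Python. This is only an illustration of the principle under simplifying assumptions (a genes-by-samples raw count matrix, genes with any zero count skipped); the function name is hypothetical, and for real analyses you would use the DESeq or edgeR packages themselves. TMM is not shown here.

```python
import numpy as np

def median_of_ratios_size_factors(counts):
    """Per-sample size factors in the spirit of DESeq's median-of-ratios.

    counts: genes x samples matrix of raw counts (assumed input layout).
    Genes with a zero count in any sample are excluded, since their
    geometric mean is zero and the ratio is undefined.
    """
    counts = np.asarray(counts, dtype=float)
    expressed = (counts > 0).all(axis=1)
    if not expressed.any():
        raise ValueError("no gene has nonzero counts in every sample")
    log_counts = np.log(counts[expressed])
    # Log of the per-gene geometric mean across samples (the reference).
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Size factor = median over genes of count / reference, per sample.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

# Toy data: sample 2 is sequenced twice as deeply as sample 1.
counts = np.array([[10, 20],
                   [100, 200],
                   [50, 100],
                   [0, 5]])
size_factors = median_of_ratios_size_factors(counts)
normalized = counts / size_factors  # divide each column by its size factor
print(size_factors)
print(normalized)
```

Using the median makes the size factor robust to a handful of highly expressed or strongly changing genes, which is the main weakness of plain library-size scaling.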

Converting your normalized expression values to z-scores (computed per gene, across samples) can be useful if you want to generate a nice heatmap or perform cluster analysis. However, you will lose information on the magnitude of gene expression with z-scores. A gene going from 100 reads to 500 reads will have the same z-score as a gene going from 1000 reads to 5000 reads. Another option is to use the variance stabilization method from the DESeq package.
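The loss of magnitude is easy to check numerically. The sketch below assumes a genes-by-samples matrix of normalized expression values and computes a z-score for each gene across samples; `row_zscore` is an illustrative helper name, not a function from an existing package.

```python
import numpy as np

def row_zscore(expr):
    """Z-score each gene (row) across samples (columns).

    expr: genes x samples matrix of normalized expression values.
    Genes with zero variance get a z-score of 0 in every sample.
    """
    expr = np.asarray(expr, dtype=float)
    mean = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, ddof=1, keepdims=True)
    sd_safe = np.where(sd > 0, sd, 1.0)  # avoid division by zero
    return (expr - mean) / sd_safe

# Toy data: gene 2 is gene 1 scaled by 10 (100 -> 500 vs 1000 -> 5000).
expr = np.array([[100, 500, 100, 500],
                 [1000, 5000, 1000, 5000]])
z = row_zscore(expr)

# Both rows are identical after z-scoring: the tenfold difference in
# expression level between the two genes is no longer visible.
print(z)
```

This is fine for heatmaps and clustering on expression patterns, but it is the reason z-scores should not be read as expression levels.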
