Hi all, I have 250 samples from healthy and disease states. I want to integrate gene expression data into a metabolic model and do flux balance analysis. Can I use FPKM directly for this work, or should I normalize FPKM first? For example, in some publications I see that researchers applied quantile normalization to FPKM.
Any help is welcome
This question has also been asked on Bioconductor support. Here is the link
Who? Data on the FPKM scale is already normalised for sequencing depth and gene length, but not for other cross-sample differences.
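To make concrete what "already normalised" covers, here is a minimal sketch of the standard FPKM formula in Python. The function and argument names are my own, not from any package. It shows that FPKM only rescales by gene length and library size within one sample; nothing in it aligns the overall distributions of different samples.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    fragments              -- fragments mapped to this gene (hypothetical name)
    gene_length_bp         -- gene/transcript length in base pairs
    total_mapped_fragments -- all mapped fragments in this sample
    """
    # Divide by length in kb (bp / 1e3) and by depth in millions (total / 1e6):
    # the two factors combine into the 1e9 multiplier.
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


# Same raw count and gene, two samples sequenced to different depths
sample_a = fpkm(100, 2000, 10_000_000)
sample_b = fpkm(100, 2000, 20_000_000)
print(sample_a, sample_b)
```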
For example, see this paper: https://www.pnas.org/content/115/50/E11874.short The supplementary data explains the quantile normalization of FPKM.
I'm not familiar with metabolic modeling, but just because something has been published does not mean it is correct.
Indeed, you may want to ask on Cross Validated (StackExchange) about the feasibility of performing quantile normalisation on FPKM data. It does not feel right to me. A better transformation would be to Z-scores, via the zFPKM package in R.
Maybe check quantro, a framework to test whether your dataset fulfills the assumptions of quantile normalization.
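For anyone unsure what quantile normalization actually does to the values, here is a minimal pure-Python sketch (not the code from the paper or from any R package; the function name and the toy matrix are made up for illustration). It forces every sample to share the same distribution, which is why the method assumes that global distribution differences between samples are technical rather than biological. With healthy versus disease groups that assumption is worth checking before applying it.

```python
from statistics import mean


def quantile_normalize(matrix):
    """Quantile-normalize a genes x samples matrix (list of rows).

    Tied values within a sample get the average of the reference
    values for the ranks they span.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    # One list per sample (column)
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    # Reference distribution: mean of the k-th smallest value across samples
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(sc[k] for sc in sorted_cols) for k in range(n_genes)]

    normalized_cols = []
    for col in columns:
        # Gene indices ordered by expression within this sample
        order = sorted(range(n_genes), key=lambda i: col[i])
        out = [0.0] * n_genes
        k = 0
        while k < n_genes:
            # Extend over a run of tied values
            end = k
            while end + 1 < n_genes and col[order[end + 1]] == col[order[k]]:
                end += 1
            value = mean(reference[k:end + 1])
            for pos in range(k, end + 1):
                out[order[pos]] = value
            k = end + 1
        normalized_cols.append(out)

    # Back to genes x samples
    return [[normalized_cols[j][i] for j in range(n_samples)]
            for i in range(n_genes)]


# Toy matrix: 4 genes (rows) x 3 samples (columns), made-up values
toy = [[5, 4, 3],
       [2, 1, 4],
       [3, 4, 6],
       [4, 2, 8]]
normalized = quantile_normalize(toy)
for row in normalized:
    print([round(v, 2) for v in row])
```

After this step, every sample without ties contains exactly the same set of values, only assigned to different genes.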