Question: gene expression correlation, FPKM, WGCNA
18 months ago by
Reza0 wrote:

Hi all, I have expression values in FPKM. Can I directly use the average of the log2(FPKM) values as input for WGCNA?


R rna-seq bioinformatics • 1.0k views
18 months ago by
Houston, TX
RamRS25k wrote:

I googled "FPKM WGCNA". First link:

Text from first link:

Can WGCNA be used to analyze RNA-Seq data?

Yes. As far as WGCNA is concerned, working with (properly normalized) RNA-seq data isn't really any different from working with (properly normalized) microarray data.

We suggest removing features whose counts are consistently low (for example, removing all features that have a count of less than say 10 in more than 90% of the samples) because such low-expressed features tend to reflect noise and correlations based on counts that are mostly zero aren't really meaningful. The actual thresholds should be based on experimental design, sequencing depth and sample counts.
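As an illustration of the filtering rule quoted above (not code from the WGCNA authors), here is a minimal Python sketch. The function name `filter_low_count_features`, the toy data, and the dictionary layout are assumptions made for this example; the thresholds of 10 counts and 90% of samples are the ones suggested in the text and should be adapted to the experiment.

```python
def filter_low_count_features(counts, min_count=10, max_low_fraction=0.9):
    """Drop features whose count is below min_count in more than
    max_low_fraction of the samples; keep everything else."""
    kept = {}
    for feature, values in counts.items():
        # Fraction of samples in which this feature is "low".
        low_fraction = sum(v < min_count for v in values) / len(values)
        if low_fraction <= max_low_fraction:
            kept[feature] = values
    return kept


# Hypothetical toy data: feature name -> raw counts across 10 samples.
counts = {
    "geneA": [0, 0, 1, 0, 0, 2, 0, 0, 0, 0],               # low in all samples
    "geneB": [50, 80, 120, 60, 75, 90, 110, 40, 65, 70],   # never low
    "geneC": [0, 0, 0, 0, 0, 0, 0, 0, 25, 30],             # low in 80% of samples
}

filtered = filter_low_count_features(counts)
print(sorted(filtered))
```

With these thresholds only geneA is removed; geneC stays because it is low in 80% of the samples, which does not exceed the 90% cutoff.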

We then recommend a variance-stabilizing transformation. For example, package DESeq2 implements the function varianceStabilizingTransformation which we have found useful, but one could also start with normalized counts (or RPKM/FPKM data) and log-transform them using log2(x+1). For highly expressed features, the differences between full variance stabilization and a simple log transformation are small.

Whether one uses RPKM, FPKM, or simply normalized counts doesn't make a whole lot of difference for WGCNA analysis as long as all samples were processed the same way. These normalization methods make a big difference if one wants to compare expression of gene A to expression of gene B; but WGCNA calculates correlations for which gene-wise scaling factors make no difference. (Sample-wise scaling factors of course do, so samples do need to be normalized.)
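The point about scaling factors can be checked numerically. The sketch below is an illustration with made-up numbers, not part of the quoted FAQ: it computes a Pearson correlation on untransformed values, then shows that multiplying one gene by a positive constant leaves the correlation unchanged, while applying a different factor to each sample changes it. The helper `pearson` and all values are assumptions for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical expression values of two genes across five samples.
gene_a = [1.0, 2.0, 4.0, 3.0, 5.0]
gene_b = [2.0, 1.0, 5.0, 3.0, 4.0]
r_original = pearson(gene_a, gene_b)

# Gene-wise scaling: one positive constant applied to every value of a gene
# (for example, a gene-length factor). The correlation is unchanged.
gene_a_scaled = [v * 7.5 for v in gene_a]
r_gene_scaled = pearson(gene_a_scaled, gene_b)

# Sample-wise scaling: each sample has its own factor applied to all genes
# (for example, uncorrected differences in sequencing depth). The correlation changes.
sample_factors = [1.0, 10.0, 0.5, 4.0, 0.2]
gene_a_depth = [v * f for v, f in zip(gene_a, sample_factors)]
gene_b_depth = [v * f for v, f in zip(gene_b, sample_factors)]
r_sample_scaled = pearson(gene_a_depth, gene_b_depth)

print(r_original, r_gene_scaled, r_sample_scaled)
```

Note that the exact invariance holds for values on the original scale; after a log2(x+1) transformation a gene-wise factor no longer cancels out exactly, although the effect is small for highly expressed genes.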

If data come from different batches, we recommend to check for batch effects and, if needed, adjust for them. We use ComBat for batch effect removal but other methods should also work.

Finally, we usually check quantile scatterplots to make sure there are no systematic shifts between samples; if sample quantiles show correlations (which they usually do), quantile normalization can be used to remove this effect.
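For readers unfamiliar with quantile normalization, here is a minimal pure-Python sketch of the general technique. It is not the implementation used by the WGCNA authors. The function name `quantile_normalize`, the input layout (one list per sample), and the toy values are assumptions; ties are broken by position rather than averaged, which a production implementation would usually handle more carefully.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of samples (each a list of feature values,
    all of the same length) so that every sample ends up with the same
    distribution of values."""
    n_features = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples)
        for k in range(n_features)
    ]
    normalized = []
    for s in samples:
        # Feature indices ordered from the smallest to the largest value.
        order = sorted(range(n_features), key=lambda i: s[i])
        out = [0.0] * n_features
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical log-expression values: three samples, four features each.
samples = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 3.0, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(samples)
for row in normalized:
    print(row)
```

After normalization each sample contains the same set of values, and within each sample the features keep their original rank order.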

Relevant part:

but one could also start with normalized counts (or RPKM/FPKM data) and log-transform them using log2(x+1).
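A minimal Python sketch of that transformation follows, assuming FPKM values are stored per gene as a list across samples; the gene names, values, and the helper `log2_plus_one` are hypothetical. The +1 pseudo-count matters because log2(0) is undefined, so a plain log2(FPKM) fails for any gene with zero expression in a sample.

```python
import math


def log2_plus_one(values):
    """Apply log2(x + 1) to each value; zero stays zero."""
    return [math.log2(v + 1.0) for v in values]


# Hypothetical FPKM values: gene name -> values across three samples.
fpkm = {
    "geneA": [0.0, 1.0, 3.0],
    "geneB": [7.0, 15.0, 1023.0],
}

log_fpkm = {gene: log2_plus_one(values) for gene, values in fpkm.items()}
print(log_fpkm)
```

The transform is applied to each sample's value individually, before any averaging.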

I know neither concept (FPKM/WGCNA). I think you should invest a little more effort into your questions before you expect others to put in any effort for you. Apologies if you're in Iran; this site is not accessible there. I'd recommend editing your profile and adding your location there so that anyone responding to your questions knows you're working with restricted access to the Internet. Also, see if you could mention your location in future posts.
