Question

edgeR normalization and partial correlation matrix calculation

7.0 years ago

moxu ▴ 510

I am learning "ridge.net" to study gene-gene interaction (GGI). ridge.net can generate a partial correlation matrix (PCM) from gene expression levels (RNA-seq gene counts), and the PCM can then be used to build a GGI network. However, I assume the raw or expected gene counts need to be normalized before computing the PCM to get better results.
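To make the PCM idea concrete, here is a minimal base-R sketch that derives partial correlations from a ridge-regularized inverse of the sample correlation matrix. It is not ridge.net's own procedure. The helper name `partial_cor`, the penalty `lambda = 0.1`, and the simulated expression values are assumptions for illustration only.

```r
# Toy expression matrix: 20 samples (rows) x 5 genes (columns); values are simulated.
set.seed(1)
expr <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5,
               dimnames = list(NULL, paste0("gene", 1:5)))
expr[, 2] <- expr[, 1] + rnorm(20, sd = 0.3)  # make gene2 track gene1

# Hypothetical helper: partial correlations from a regularized precision matrix.
partial_cor <- function(x, lambda = 0.1) {
  s <- cor(x)                                  # sample correlation matrix (genes x genes)
  omega <- solve(s + lambda * diag(ncol(s)))   # ridge-regularized inverse
  d <- sqrt(diag(omega))
  p <- -omega / outer(d, d)                    # p_ij = -omega_ij / sqrt(omega_ii * omega_jj)
  diag(p) <- 1
  dimnames(p) <- dimnames(s)
  p
}

pcm <- partial_cor(expr)
round(pcm, 2)
```

Rows of the input are samples and columns are genes, so an edgeR expression matrix (genes in rows) would need to be transposed with `t()` first.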

I have to admit that I fell in love with edgeR for RNA-seq analysis. It's really powerful. So I am wondering whether it can be used to normalize the raw or expected gene counts for the purpose of creating a PCM. A few questions regarding such normalization:

Is CPM a good normalization for this purpose? It seems CPM only rescales the original gene counts. What about the fancy Bayes shrinkage, the dispersion estimates and so on? Aren't these important as well?
Should "logcpm <- cpm(y, prior.count=2, normalized.lib.sizes=TRUE, log=TRUE)" be called after
```
y <- DGEList(counts=d);
```
Or after
```
y <- calcNormFactors(y); # global normalization
```
Or even after
```
y <- estimateDisp(y, dsgn);
```
Is logCPM better than CPM for the purpose of computing the PCM and, later on, deriving the GGI? I am inclined towards logCPM.
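One way to see which of the three steps affects the result is to call `cpm()` after each one and compare the outputs. The sketch below does this on simulated counts. The matrix size, the two-group design, and the negative binomial parameters are assumptions for illustration, not real data.

```r
library(edgeR)

# Simulated counts (assumption: 500 genes x 6 samples, two groups of three).
set.seed(42)
d <- matrix(rnbinom(500 * 6, mu = 50, size = 5), nrow = 500,
            dimnames = list(paste0("gene", 1:500), paste0("s", 1:6)))
group <- factor(rep(c("A", "B"), each = 3))
dsgn <- model.matrix(~ group)

# Step 1: DGEList only; all normalization factors are still 1.
y <- DGEList(counts = d)
logcpm_raw <- cpm(y, prior.count = 2, normalized.lib.sizes = TRUE, log = TRUE)

# Step 2: TMM factors are stored in y$samples$norm.factors.
y <- calcNormFactors(y)
logcpm_tmm <- cpm(y, prior.count = 2, normalized.lib.sizes = TRUE, log = TRUE)

# Step 3: dispersion estimates are added; counts and library sizes are untouched.
y <- estimateDisp(y, dsgn)
logcpm_disp <- cpm(y, prior.count = 2, normalized.lib.sizes = TRUE, log = TRUE)

all.equal(logcpm_tmm, logcpm_disp)  # compares the step 2 and step 3 matrices
range(logcpm_raw - logcpm_tmm)      # shift introduced by the TMM factors
```

Since `cpm()` on a DGEList works from the counts, the library sizes and the normalization factors, the step 2 and step 3 matrices should come out equal, while the step 1 matrix differs only through the normalization factors.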

next-gen rna-seq R gene

logCPM might not be a good choice, because ERCC.00112 shows up in the network with the strongest edges (branch factor 3~5).

Any insight?
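The ERCC.00112 identifier looks like an ERCC spike-in control rather than an endogenous gene. If spike-ins are not meant to appear in the network, one option is to drop them from the count matrix before normalization. This is a minimal sketch that assumes every spike-in row name starts with "ERCC". The count matrix is made up.

```r
# Toy count matrix; gene names and values are made up, apart from the ERCC-style prefix.
counts <- matrix(c(10, 0, 250, 31, 12, 3, 240, 28), nrow = 4,
                 dimnames = list(c("geneA", "geneB", "ERCC.00112", "geneC"),
                                 c("s1", "s2")))

# Flag rows whose names start with "ERCC" and keep everything else.
is_spike <- grepl("^ERCC", rownames(counts))
filtered <- counts[!is_spike, , drop = FALSE]
rownames(filtered)
```

The filtered matrix can then be passed to `DGEList()` in place of the original counts.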

6.9 years ago by moxu ▴ 510