Question

Dispersion estimation from DESeq and edgeR

10.1 years ago

tonja.r ▴ 600

I am using DESeq and edgeR normalization for my data. I think I am missing something about how the dispersion relates to the normalized data.

The estimated dispersions and the fitted line can be seen with plotDispEsts in DESeq and plotMeanVar in edgeR. The fitted line is the variance vs. mean of the normalized counts that account for the dispersion, right?

So, how can I obtain normalized counts that account for the dispersion?

From DESeq, normalized counts can be extracted with normalized_counts=counts(cds,normalized=TRUE), but when I plot log2(variance) vs. log2(mean) (variance and mean computed per gene), I do not get the fitted line; I get almost the same result as with the raw counts. So the normalization did not account for the dispersion.
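The per-gene plot described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the matrix is simulated with rnbinom rather than taken from counts(cds, normalized=TRUE), and the names n_genes, n_samples, alpha and gene_means are made up for the example. The red curve is the negative binomial variance function, variance = mean + alpha * mean^2, drawn with the assumed dispersion alpha so the point cloud has something to be compared against; it is not a curve fitted by DESeq or edgeR.

```r
# Simulated stand-in for a normalized count matrix (genes x samples).
# In a real analysis this would be counts(cds, normalized = TRUE).
set.seed(42)
n_genes   <- 2000   # hypothetical number of genes
n_samples <- 6      # hypothetical number of samples
alpha     <- 0.1    # assumed dispersion, for illustration only

# One true mean per gene, spread over several orders of magnitude
gene_means <- 2^runif(n_genes, min = 2, max = 12)

# matrix() fills column by column, so repeating gene_means once per
# sample gives every row (gene) its own mean across all samples
normalized_counts <- matrix(
  rnbinom(n_genes * n_samples,
          mu = rep(gene_means, times = n_samples),
          size = 1 / alpha),
  nrow = n_genes, ncol = n_samples
)

# Per-gene mean and variance across samples
gene_mean <- rowMeans(normalized_counts)
gene_var  <- apply(normalized_counts, 1, var)
keep <- gene_mean > 0 & gene_var > 0   # avoid log2(0)

plot(log2(gene_mean[keep]), log2(gene_var[keep]),
     pch = 16, cex = 0.4, col = "grey50",
     xlab = "log2(mean)", ylab = "log2(variance)")

# Reference curves: negative binomial variance with the assumed alpha,
# and the Poisson line (variance = mean) as a dashed diagonal
mu_grid <- 2^seq(2, 12, length.out = 100)
lines(log2(mu_grid), log2(mu_grid + alpha * mu_grid^2), col = "red", lwd = 2)
abline(0, 1, lty = 2)
```

With real data, only the first block would change: normalized_counts would come from the package instead of the simulation, and alpha would be replaced by whatever dispersion estimate is being compared against.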

After running the following in edgeR, I could extract the normalized counts using $pseudo.counts, but the plot of log2(variance) vs. log2(mean) is almost the same for the pseudo-counts and the normal counts. plotMeanVar, however, shows the raw variances of the counts (grey dots) and the variances using the tagwise dispersions (light blue dots). Somehow, I could not reproduce this plot by plotting just log2(variance) vs. log2(mean) for the pseudo-counts and the normal counts.

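library(edgeR)
# Build the DGEList from the count matrix and the condition labels
cds <- DGEList(counts = histone_m, group = cond)
# Normalization factors (TMM by default)
y <- calcNormFactors(cds)
# Common dispersion first, then tagwise (per-gene) dispersions
y <- estimateCommonDisp(y)
y <- estimateTagwiseDisp(y)
# Pseudo-counts: counts adjusted to a common library size
y$pseudo.counts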

Either I do not understand the dispersion estimation step, or I do not know how to extract normalized counts that account for the fitted dispersion. How can I do that?

R • 3.0k views

updated 3.2 years ago by Ram 45k • written 10.1 years ago by tonja.r ▴ 600