Question

Which normalized counts from DESeq2 are better?

10.1 years ago

weixiaokuan ▴ 140

Hi,

I may have a silly question. How should I get normalized counts from DESeq2? If I understand correctly, rlog and vst both give normalized counts, so which one is better? Or should I just use "counts(dds, normalized=TRUE)" instead of rlog or vst? Thank you.

-X

RNA-Seq DESeq2 • 16k views

ADD COMMENT • link updated 2.6 years ago by Ram 45k • written 10.1 years ago by weixiaokuan ▴ 140

Define "better".

ADD REPLY • link 10.1 years ago by karl.stamm 4.1k

Good point. Actually, I should have asked about the advantages and disadvantages of each method.

Also, if I want to use these normalized data to calculate fold change, which one gives a value closest (or identical) to the one calculated by the "results" function of DESeq2?

Thank you.

ADD REPLY • link 10.1 years ago by weixiaokuan ▴ 140

Ram · Accepted Answer · 2015-07-03

10.1 years ago

Devon Ryan 105k

For things like calculating fold-changes, one would normally use the output of counts(dds, normalized=TRUE). The other two you mentioned aren't so much counts as transformed counts. These are useful for things like making a heatmap, or PCA, or anything that involves clustering/imaging.

BTW, none of these will give you the same fold-change that DESeq2 does, because you're not using a prior distribution.
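To make the distinction concrete, here is a minimal base-R sketch of what size-factor normalization and a "naive" fold change look like. It assumes median-of-ratios size factors (the approach described in the DESeq2 paper); the count matrix, the helper name `median_ratio_size_factors`, and the sample labels are all made up for illustration, and this is not DESeq2's own code.

```r
# Toy raw count matrix: 4 genes x 4 samples (made-up numbers).
counts_raw <- matrix(
  c(100, 200, 150,  300,
     50, 100,  75,  150,
     10,  20,  60,  120,
    400, 800, 600, 1200),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("A1", "A2", "B1", "B2"))
)
condition <- factor(c("A", "A", "B", "B"))

# Hypothetical helper: median-of-ratios size factors. For each sample,
# take the median ratio of its counts to the per-gene geometric mean
# (genes containing a zero count are skipped).
median_ratio_size_factors <- function(m) {
  log_geo_mean <- rowMeans(log(m))
  keep <- is.finite(log_geo_mean)
  apply(m, 2, function(col) exp(median(log(col[keep]) - log_geo_mean[keep])))
}

sf <- median_ratio_size_factors(counts_raw)

# Normalized counts: each sample's column divided by its size factor.
norm_counts <- sweep(counts_raw, 2, sf, "/")

# Naive log2 fold change (B vs A) from group means of normalized counts.
mean_a <- rowMeans(norm_counts[, condition == "A", drop = FALSE])
mean_b <- rowMeans(norm_counts[, condition == "B", drop = FALSE])
naive_log2fc <- log2(mean_b / mean_a)
```

Note that the size factors here are computed from the count matrix alone. The resulting `naive_log2fc` is a plain ratio of means, with no model fit and no prior, which is why it should not be expected to match the value reported by DESeq2.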

ADD COMMENT • link 10.1 years ago by Devon Ryan 105k

Devon,

Thank you for answering my questions. But I am wondering why DESeq2 doesn't use the rlog or vst data matrix to calculate the log2FC, and instead uses scaled (size factor) and normalized (dispersion) reads. I cannot find any reference explaining this practice. Do you have any insight on this?

-X

ADD REPLY • link updated 2.6 years ago by Ram 45k • written 10.0 years ago by weixiaokuan ▴ 140

You have the question backwards. The question you should be asking is what would be gained by using heavily modified data for statistics rather than the raw data. Raw counts follow a negative binomial distribution, which is relatively easy to deal with. Truth be told, DESeq2 doesn't directly scale and normalize the counts; it just includes terms for them in its model (the size factors enter as an offset and the dispersion is profiled out; I think edgeR does this too). The details are in the DESeq2 paper.
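As a rough illustration of "including a term in the model" rather than dividing the counts first, the base-R sketch below fits a GLM to raw counts with the log size factor as an offset. It swaps in a Poisson family as a simplified stand-in for the negative binomial model that DESeq2 actually fits, and the counts and size factors are invented for the example.

```r
# Made-up raw counts for one gene in six samples, plus assumed size factors.
y         <- c(12, 30, 18, 80, 45, 110)
sf        <- c(0.6, 1.5, 0.9, 1.6, 0.9, 2.2)
condition <- factor(c("A", "A", "A", "B", "B", "B"))

# Raw counts are the response; log(size factor) is an offset, so the
# normalization is part of the model instead of a preprocessing step.
# NOTE: Poisson is a stand-in here; DESeq2 uses a negative binomial GLM.
fit <- glm(y ~ condition + offset(log(sf)), family = poisson())

# Coefficients are on the natural-log scale; convert to log2.
log2fc_model <- unname(coef(fit)["conditionB"]) / log(2)

# For this two-group Poisson case the estimate has a closed form:
# the ratio of (total counts / total size factor) between groups.
log2fc_closed <- log2(
  (sum(y[condition == "B"]) / sum(sf[condition == "B"])) /
  (sum(y[condition == "A"]) / sum(sf[condition == "A"]))
)
```

The two values agree up to the fitting tolerance, which shows that the model works from the raw counts and accounts for library size itself. The real negative binomial fit additionally involves a per-gene dispersion, which this sketch leaves out.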

ADD REPLY • link updated 2.6 years ago by Ram 45k • written 10.0 years ago by Devon Ryan 105k

I am curious: when obtaining normalized counts from counts(dds, normalized=TRUE), does the design provided in the following command have an effect on the normalized counts?

dds <- DESeqDataSetFromMatrix(countData = x, colData = ss.edesign, design = ~Condition)

ADD REPLY • link updated 2.6 years ago by Ram 45k • written 6.7 years ago by Gabby ▴ 20