Question

estimation of size factors in DESeq2 analysis

4

11.0 years ago

Assa Yeroslaviz ★ 1.9k

Hi all,

In relation to a mail from January this year in the r-help community, I followed Simon's advice to do my analyses in DESeq2 instead of DESeq.

I am working on an RNA-seq dataset from C. elegans. I mapped the data against the Ensembl genome build WBcel215, using tophat2 to map the reads and featureCounts to count them (both with default parameters).

I have two conditions, control and a knock-out, each with three replicates. Now I am trying to find differentially regulated genes between the two conditions using DESeq2.

This is the script I am using to read my raw count table into DESeq2:

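library(DESeq2)
# Read the raw featureCounts table (tab-separated, quoting disabled).
# read.table() expects a character string for 'quote', so "" is used instead of F.
featureCountTable <- read.table("featureCountTable_RawCounts.txt", sep="\t", quote="")
# One row per sample, in the same order as the columns of the count table
colData <- data.frame(row.names=names(featureCountTable), condition = c(rep("wt",3), rep("cpb3", 3)))
cds <- DESeqDataSetFromMatrix(
    countData = featureCountTable,
    colData   = colData,
    design    = ~ condition
    )
# Estimate size factors and dispersions, then fit the model and extract results
fit = DESeq(cds)
res = results(fit)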
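
One thing that may be worth checking in a script like this: R orders factor levels alphabetically by default, so with the labels above "cpb3" rather than "wt" would become the reference level. The sketch below uses a made-up count table (toyCounts and its sample names are hypothetical, not the real data) to show how to confirm that the sample annotation lines up with the count columns and how to set the control as the reference explicitly.

```r
# Hypothetical toy count table: 4 genes x 6 samples, same layout as the real one
set.seed(42)
toyCounts <- matrix(rpois(24, lambda = 50), nrow = 4,
                    dimnames = list(paste0("gene", 1:4),
                                    c(paste0("wt_", 1:3), paste0("cpb3_", 1:3))))

# Sample annotation, one row per column of the count table
colData <- data.frame(row.names = colnames(toyCounts),
                      condition = factor(c(rep("wt", 3), rep("cpb3", 3))))

# Default levels are alphabetical, so "cpb3" comes first
defaultLevels <- levels(colData$condition)

# Make the control the reference level explicitly
colData$condition <- relevel(colData$condition, ref = "wt")

# Sample names must line up with the count table columns
stopifnot(identical(rownames(colData), colnames(toyCounts)))
print(levels(colData$condition))
```

With "wt" as the first level, fold changes should read as knock-out relative to control, which is usually the intended direction.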

But I am getting the same problem with DESeq2 as I got with DESeq. When I run the DESeq command, I get these warnings:

Warning messages:
1: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
2: step size truncated due to divergence

So again I tried changing the fitType:

fit = DESeq(cds, fitType="local")

This then came back without any warnings.

This time the two fit types look almost identical (at least to my inexperienced eyes).

I have attached both dispersion plots. The red line goes through the point cloud in both cases (which is how Simon defined a good fit in our last exchange; I wish it had been that easy :-) ).

With the local fit type there are more outliers, and the right end of the curve goes up again. I am not sure whether this is a good thing.

So, my question is - which of the two options is better?

I understand that in general the parametric (default) option is better, but here it gives me a warning, which suggests that something in the fit calculation went wrong.

How should I interpret these plots?

Thanks for the help,

Assa

[Figure: dispersion plot, default/parametric fit]

[Figure: dispersion plot, local fit]

P.S.

I also tried to post this on the Bioconductor help list, but got no responses, so I am trying here.

DESeq2 R fitType • 8.0k views

ADD COMMENT • link updated 3.7 years ago by Ram 45k • written 11.0 years ago by Assa Yeroslaviz ★ 1.9k

0

When did you post this to the Bioconductor list? I can't seem to find it. The DESeq2 authors are usually very good at replying in a timely manner, so I'm surprised they didn't get back to you.

ADD REPLY • link updated 3.7 years ago by Ram 45k • written 11.0 years ago by Devon Ryan 105k

0

About a week ago I sent the mail to bioconductor@r-project.org. Is it still the correct address?

It seems that my question also did not appear in the daily Bioconductor digest, so I can't really say why that is.

Can it be that it has a problem with images?

ADD REPLY • link updated 3.7 years ago by Ram 45k • written 11.0 years ago by Assa Yeroslaviz ★ 1.9k

0

Perhaps it was the images, since it didn't go through. Give it another try.

In the dispersion plot, it looks like the slightly sigmoidal trend with an up-tick at the end is just not easily fit by the parametric equation. My guess is that Michael or Simon will advise you to go with the local fit, but of course it's always more definitive to get that answer from them.

ADD REPLY • link updated 3.7 years ago by Ram 45k • written 11.0 years ago by Devon Ryan 105k

Ram · Accepted Answer · 2014-07-02

3

11.0 years ago

Michael Love ★ 2.6k

This warning message can be ignored. It comes from a call to R's glm() function when capturing the (dispersion ~ mean) trend. The trend is fit iteratively until convergence, so although glm() complained at some step, it produced a final fit without error. If the parametric trend does not converge, a local fit is substituted.
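
To make the iterative fit more concrete, here is a simplified sketch in base R. It is an illustration under stated assumptions, not DESeq2's actual code: it assumes the parametric trend has the form dispersion = a0 + a1 / mean, fits it with a Gamma-family glm() using an identity link, and repeats the fit after dropping extreme genes until the coefficients stop changing. The simulated data, starting values, and cut-offs are all made up for the example. The log(ifelse(y == 0, 1, y/mu)) expression in the warning is part of the deviance calculation in R's Gamma family, which yields NaN when an intermediate step proposes a negative mu.

```r
# Simulated per-gene means and dispersions (made-up data, not from this thread)
set.seed(1)
n <- 500
means <- exp(runif(n, log(5), log(5000)))
trueTrend <- 0.05 + 2 / means
disps <- trueTrend * rgamma(n, shape = 10, rate = 10)  # multiplicative noise with mean 1

# Simplified iterative fit of dispersion = a0 + a1 / mean
coefs <- c(0.1, 1)   # starting values (an assumption of this sketch)
converged <- FALSE
for (iter in 1:20) {
  ratio <- disps / (coefs[1] + coefs[2] / means)
  good <- ratio > 1e-4 & ratio < 15          # drop extreme genes before refitting
  fit <- suppressWarnings(
    glm(disps[good] ~ I(1 / means[good]),
        family = Gamma(link = "identity"), start = coefs))
  oldCoefs <- coefs
  coefs <- unname(coef(fit))
  if (any(coefs <= 0)) stop("trend fit failed; a local fit would be the fallback")
  # Stop once the coefficients are stable and glm() itself reports convergence
  if (sum(log(coefs / oldCoefs)^2) < 1e-6 && fit$converged) {
    converged <- TRUE
    break
  }
}
print(coefs)
```

A warning raised inside one of the glm() calls does not by itself mean the final coefficients are unusable; what matters in this sketch is whether the loop ends with stable, positive coefficients.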

What version of DESeq2 are you using? I thought I had worked on more comprehensible warning reporting in this function.

I would go with the parametric to avoid the curve at the right side, although it shouldn't matter much.

ADD COMMENT • link updated 3.7 years ago by Ram 45k • written 11.0 years ago by Michael Love ★ 2.6k

0

Thanks for the information. I am working with DESeq2_1.4.0.

If I understand your comment correctly, you mean that DESeq first tries a parametric fit and then, if that doesn't work, automatically switches to the local fit?

ADD REPLY • link updated 3.7 years ago by Ram 45k • written 11.0 years ago by Assa Yeroslaviz ★ 1.9k

1

Yes, but in your case, the parametric fit does converge/work. The warning is from one step in the fitting process, but then the parametric fit converged in the end.
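
For anyone wanting to check this on their own data, the following is a minimal sketch of how one might compare the two fit types and see which trend was actually used. It runs on the simulated data set from makeExampleDESeqDataSet() rather than the C. elegans counts, and it assumes that the fitted dispersion function carries a "fitType" attribute, which may differ between DESeq2 versions.

```r
library(DESeq2)

# Simulated data shipped with DESeq2 (not the data from this thread)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)
dds <- estimateSizeFactors(dds)

# Fit the dispersion trend both ways
ddsPar <- estimateDispersions(dds, fitType = "parametric")
ddsLoc <- estimateDispersions(dds, fitType = "local")

# Which trend was actually used? If the parametric fit had failed and a local
# fit had been substituted, the first value is expected to read "local"
# (assumption: the "fitType" attribute is present in the installed version)
usedPar <- attr(dispersionFunction(ddsPar), "fitType")
usedLoc <- attr(dispersionFunction(ddsLoc), "fitType")
print(c(usedPar, usedLoc))

# Side-by-side dispersion plots for a visual comparison
par(mfrow = c(1, 2))
plotDispEsts(ddsPar, main = "parametric")
plotDispEsts(ddsLoc, main = "local")
```

If both plots show the fitted line running through the middle of the point cloud, the choice between them is unlikely to change the results much.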

ADD REPLY • link updated 3.7 years ago by Ram 45k • written 11.0 years ago by Michael Love ★ 2.6k