In relation to a mail from January this year in the r-help community, I followed Simon's advice to do my analyses in DESeq2 instead of DESeq.
I am working on an RNA-Seq data set from C. elegans. I mapped the reads to the Ensembl genome build WBcel215 with tophat2 and counted them with featureCounts (both with default parameters).
I have two conditions, control and a knock-out, with three replicates each. Now I am trying to find differentially regulated genes between the two conditions using DESeq2.
This is the script I am using to read my raw count table into DESeq2:
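library(DESeq2)
# quote must be a character string; quote="" disables quoting altogether
featureCountTable <- read.table("featureCountTable_RawCounts.txt", sep="\t", quote="")
# one row per sample (column of the count table): 3 x wt, then 3 x cpb3
colData <- data.frame(row.names=names(featureCountTable), condition = c(rep("wt",3), rep("cpb3", 3)))
cds <- DESeqDataSetFromMatrix(
  countData = featureCountTable,
  colData = colData,
  design = ~ condition
)
fit = DESeq(cds)
res = results(fit)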
But I am getting the same problem with DESeq2 as I got with DESeq. When I run the DESeq command I get these warnings:
1: In log(ifelse(y == 0, 1, y/mu)) : NaNs produced
2: step size truncated due to divergence
So again I have tried to change the fitType.
fit = DESeq(cds, fitType="local")
This then came back without any warnings.
This time the two fit types look almost the same (at least to my inexperienced eyes).
I attach both dispersion plots. The red line goes through the point cloud in both cases (which is how Simon defined a good fit in the last communication; I wish it had been so easy :-) ).
With the local fit type there are more outliers and the right end of the curve goes up again. I am not sure whether this is a good thing or not.
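For reference, a side-by-side comparison like the one described above could be produced roughly as follows. This is only a sketch: the data here are simulated with makeExampleDESeqDataSet as a stand-in (an assumption, not the count table from this post), and the names dds, fitParam, fitLocal, nParam and nLocal are made up for illustration.

```r
library(DESeq2)

# Stand-in data (assumption): simulated counts, 3 vs 3 samples.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Same dataset, two different dispersion-trend fits.
fitParam <- DESeq(dds, fitType = "parametric")
fitLocal <- DESeq(dds, fitType = "local")

# Dispersion plots side by side; the red line is the fitted trend.
par(mfrow = c(1, 2))
plotDispEsts(fitParam, main = "parametric")
plotDispEsts(fitLocal, main = "local")

# Number of genes passing a 10% FDR cutoff under each fit.
nParam <- sum(results(fitParam)$padj < 0.1, na.rm = TRUE)
nLocal <- sum(results(fitLocal)$padj < 0.1, na.rm = TRUE)
c(parametric = nParam, local = nLocal)
```

With real data, cds from the script above would take the place of dds. Comparing the two counts gives a rough idea of how much the choice of fitType changes the final gene list.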
So, my question is - which of the two options is better?
I understand that in general the parametric (default) option is better, but here it gives me a warning, so something in the fit calculation apparently did not go well.
How should I interpret these plots?
Thanks for the help,
I also tried to post this on the bioc help site, but got no responses, so I am trying here.