DESeq2 dispersion estimates
0
0
Entering edit mode
12 months ago
rfox.phd • 0

I just want to confirm the behavior the dispersion estimate in DESeq2. I simulated some data with a true dispersion of 0 using the following:

require(DESeq2)
ngenes <- 5000
reps <- 3
freq.cv <- 2
sdlog <- sqrt( log( freq.cv^2+1 ) )
frequencies <- rlnorm(ngenes, sdlog=sdlog)
frequencies <- frequencies/sum(frequencies)
A <- NULL
for (j in 1:(2*reps)) {
A <- cbind(A, rpois(ngenes, mu))
}
colnames(A) <- c(paste('untreated.', 1:reps, sep=''), paste('treated.', 1:reps, sep=''))
colData <- data.frame(c(rep('untreated',reps), rep('treated',reps)))
rownames(colData) <- names(A)
names(colData) <- 'condition'
cds <- DESeqDataSetFromMatrix(A, colData, ~condition)
cds <- estimateSizeFactors(cds)
cds <- DESeq(cds, fitType='parametric')
plotDispEsts(cds, CV=TRUE)
x <- 1:10000
y <- 1/sqrt(x)
lines(x,y,col='green',lwd=2)


And the CV plot is

It looks like the estimated dispersion pretty much tracks the Poisson variance. I would expect the variance of the negative binomial itself to be equal to the Poisson variance when the true dispersion is 0. Does this mean the overall estimated variance of the negative binomial is twice that of the Poisson variance when the true dispersion 0? I recognize this won't add much to the variance at high counts but seems like at low counts it would overestimate the variance by a factor of two. Is this correct?

Many thanks for any clarification here.

deseq2 • 383 views