DESeq2 dispersion estimates
0
0
Entering edit mode
12 months ago
rfox.phd • 0

I just want to confirm the behavior the dispersion estimate in DESeq2. I simulated some data with a true dispersion of 0 using the following:

require(DESeq2)
ngenes <- 5000
reps <- 3
freq.cv <- 2
sample.reads <- 1E6
sdlog <- sqrt( log( freq.cv^2+1 ) )
frequencies <- rlnorm(ngenes, sdlog=sdlog)
frequencies <- frequencies/sum(frequencies)
A <- NULL
mu <- frequencies * sample.reads
for (j in 1:(2*reps)) {
  A <- cbind(A, rpois(ngenes, mu))
}
colnames(A) <- c(paste('untreated.', 1:reps, sep=''), paste('treated.', 1:reps, sep=''))
colData <- data.frame(c(rep('untreated',reps), rep('treated',reps)))
rownames(colData) <- names(A)
names(colData) <- 'condition'
cds <- DESeqDataSetFromMatrix(A, colData, ~condition)
cds <- estimateSizeFactors(cds)
cds <- DESeq(cds, fitType='parametric')
plotDispEsts(cds, CV=TRUE)
x <- 1:10000
y <- 1/sqrt(x)
lines(x,y,col='green',lwd=2)

And the CV plot is

DESeq2-CV-Plot

It looks like the estimated dispersion pretty much tracks the Poisson variance. I would expect the variance of the negative binomial itself to be equal to the Poisson variance when the true dispersion is 0. Does this mean the overall estimated variance of the negative binomial is twice that of the Poisson variance when the true dispersion 0? I recognize this won't add much to the variance at high counts but seems like at low counts it would overestimate the variance by a factor of two. Is this correct?

Many thanks for any clarification here.

deseq2 • 383 views
ADD COMMENT

Login before adding your answer.

Traffic: 1780 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6