Question: Using the dispersion estimates from DESeq2 on one dataset to generate negative binomial samples
3.8 years ago by
hirak.sarkar20
hirak.sarkar20 wrote:

Hi,

I want to use the dispersion estimates from DESeq2 to simulate random variables that follow a negative binomial distribution. According to the DESeq2 paper (page 2):

Within-group variability, i.e., the variability between replicates, is modeled by the dispersion parameter α_i, which describes the variance of counts via Var(K_ij) = μ_ij + α_i μ_ij^2.

This suggests that, in the negative binomial parameterization with mean μ and size parameter r (variance μ + μ^2/r), r = 1/α. Please let me know whether my understanding is correct.
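To make the relation concrete, here is a minimal Python sketch (standard library only) that converts a mean and a DESeq2-style dispersion into the (r, p) form of the negative binomial and checks that the implied variance matches μ + α μ^2. The function names `nb_params_from_dispersion` and `nb_moments` are hypothetical helpers written for this post; they are not part of DESeq2. The sketch assumes α > 0 and the "number of failures before r successes" convention, with p the success probability.

```python
def nb_params_from_dispersion(mu, alpha):
    """Map a mean `mu` and dispersion `alpha` to (r, p).

    Assumes the convention where the count is the number of failures
    before r successes, each success having probability p.
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    if alpha <= 0:
        raise ValueError("alpha must be positive; alpha = 0 is the Poisson limit")
    r = 1.0 / alpha          # size parameter is the reciprocal of the dispersion
    p = r / (r + mu)         # success probability that yields mean mu
    return r, p


def nb_moments(r, p):
    """Mean and variance of the negative binomial with size r and probability p."""
    mean = r * (1.0 - p) / p
    var = r * (1.0 - p) / (p * p)
    return mean, var


if __name__ == "__main__":
    mu, alpha = 100.0, 0.1
    r, p = nb_params_from_dispersion(mu, alpha)
    mean, var = nb_moments(r, p)
    # The variance recovered from (r, p) should agree with mu + alpha * mu^2.
    print(r, p, mean, var, mu + alpha * mu ** 2)
```

For what it is worth, R's `rnbinom(n, size, mu)` uses the same mean/size parameterization, so under this reading `size = 1/alpha` would be the value to pass.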


Tagging: Michael Love

3.8 years ago by
hirak.sarkar20
hirak.sarkar20 wrote:

Okay, I found the source of my confusion: the reciprocal of r is called \alpha, and this \alpha is what DESeq2 reports. I guess zero dispersion means that the variance equals the mean, that is, \sigma^2 = \mu, so the distribution is no longer a proper negative binomial but its Poisson limit. A possible hack I can think of is substituting a very small value for the dispersion to avoid a division by zero when computing r = 1/\alpha. Please suggest a better way if there is one.

The variance can then be written m + m^2/r. Some authors prefer to set α = 1/r, and express the variance as m + α m^2. In this context, and depending on the author, either the parameter r or its reciprocal α is referred to as the “dispersion parameter”, “shape parameter” or “clustering coefficient”,[5] or the “heterogeneity”[4] or “aggregation” parameter.[6] The term “aggregation” is particularly used in ecology when describing counts of individual organisms. Decrease of the aggregation parameter r towards zero corresponds to increasing aggregation of the organisms; increase of r towards infinity corresponds to absence of aggregation, as can be described by Poisson regression.
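One possible way to sidestep the division by zero, sketched below in Python with only the standard library, is to never form r = 1/α for a (near-)zero dispersion and to fall back to Poisson sampling instead, since that is the limit as r goes to infinity. For α > 0 the sketch uses the gamma-Poisson mixture: draw a rate from a gamma distribution with shape 1/α and scale α μ, then draw a Poisson count with that rate, which gives mean μ and variance μ + α μ^2. The names `sample_poisson` and `sample_nb`, and the cutoff `eps`, are assumptions made for illustration; none of this is DESeq2's own code, and the simple Poisson sampler is only practical for moderate means.

```python
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count by counting rate-`lam` arrivals in unit time.

    Simple but O(lam) per draw, so it is meant for illustration only.
    """
    if lam <= 0:
        return 0
    count = 0
    t = rng.expovariate(lam)      # waiting time until the first arrival
    while t <= 1.0:
        count += 1
        t += rng.expovariate(lam)  # waiting time until the next arrival
    return count


def sample_nb(mu, alpha, rng, eps=1e-8):
    """Draw one count with mean mu and variance mu + alpha * mu^2.

    Dispersions below the (arbitrary) cutoff `eps` are treated as zero
    and sampled from a Poisson, so 1/alpha is never computed for them.
    """
    if alpha < eps:
        return sample_poisson(mu, rng)
    # Gamma-Poisson mixture: the rate has mean mu and variance alpha * mu^2.
    lam = rng.gammavariate(1.0 / alpha, alpha * mu)
    return sample_poisson(lam, rng)


if __name__ == "__main__":
    rng = random.Random(1)
    draws = [sample_nb(20.0, 0.5, rng) for _ in range(10)]
    print(draws)
```

Compared with plugging in a tiny dispersion, the explicit Poisson branch avoids an enormous r and keeps the zero-dispersion case exact rather than approximate.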