Question: Fitting distribution in R
14 months ago, ma23 wrote:

Hi everyone!

I have gene expression data from several people.

I want to find out whether the data can be modeled by a Poisson or a negative binomial distribution.

For the Poisson I use the following commands:

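n <- length(x)
lambda <- mean(x)                                 # MLE of the Poisson parameter
k <- 0:max(x)                                     # count values to tabulate
f.obs <- as.vector(table(factor(x, levels = k)))  # observed frequency of each count
f.hyp <- dpois(k, lambda) * n                     # expected frequency of each count under Poisson
chiSquare.pois <- sum((f.obs - f.hyp)^2 / f.hyp)  # ignores the tail above max(x)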

Is this code correct?

How can I estimate the parameters of the negative binomial distribution and compare the two models (Poisson and negative binomial)?


I would probably start with the papers and source code of the established tools that model RNA-seq counts as negative binomial (NB), such as DESeq2 and edgeR, to get an impression of how and why they do it.

written 14 months ago by ATpoint

As pointed out, check previous work to understand why people chose a particular distribution for a given data type. Typically, when deciding which distribution best approximates the data, people use visual tools (e.g. density and QQ plots) and goodness-of-fit tests (e.g. the chi-squared test). For choosing between (families of) distributions, have a look at the R package fitdistrplus and its descdist() function.

written 14 months ago by Jean-Karim Heriche
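
The fitdistrplus suggestion above could look roughly like the following. This is only a minimal sketch, not a recommended pipeline: the vector x is simulated here as a stand-in for the real counts (an assumption, since the original data were not posted), and the object names fit.pois, fit.nb and gof are made up for illustration. It fits both distributions by maximum likelihood and compares them by AIC and a chi-squared goodness-of-fit test.

```r
library(fitdistrplus)

set.seed(1)
# Hypothetical data: simulated overdispersed counts standing in for the
# real expression values, which were not posted.
x <- rnbinom(200, size = 2, mu = 10)

# Skewness-kurtosis (Cullen and Frey) plot to shortlist candidate families
descdist(x, discrete = TRUE)

# Maximum-likelihood fits of both candidate distributions
fit.pois <- fitdist(x, "pois")    # estimates lambda
fit.nb   <- fitdist(x, "nbinom")  # estimates size (dispersion) and mu (mean)

# Estimated negative binomial parameters
print(fit.nb$estimate)

# Goodness-of-fit statistics for both fits side by side
gof <- gofstat(list(fit.pois, fit.nb), fitnames = c("Poisson", "NegBin"))
print(gof$aic)          # lower AIC indicates the better-supported model
print(gof$chisqpvalue)  # chi-squared goodness-of-fit p-values

# Visual check: empirical vs. fitted cumulative distribution functions
cdfcomp(list(fit.pois, fit.nb), legendtext = c("Poisson", "NegBin"))
```

With real data, replace the simulated x with the observed counts. A variance much larger than the mean (compare var(x) with mean(x)) already hints that the Poisson will fit poorly, and the AIC comparison should then favor the negative binomial.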