Fitting a distribution in R

4.7 years ago

ma23 ▴ 40

Hi everyone!

I have gene expression data from several people.

I want to find out whether the data are better modeled by a Poisson or a negative binomial distribution.

For the Poisson, I use the following commands:

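n <- length(x)                       # x: vector of non-negative integer counts
lambda <- mean(x)                    # MLE of the Poisson parameter
vals <- 0:max(x)                     # count values at which frequencies are compared
f.obs <- as.vector(table(factor(x, levels = vals)))  # observed frequency of each value
f.hyp <- dpois(vals, lambda) * n     # expected frequency of each value under Poisson(lambda)
chiSquare.pois <- sum((f.obs - f.hyp)^2 / f.hyp)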

Is this code correct?

How can I estimate the parameters of the negative binomial distribution and compare the two models (Poisson and negative binomial)?

R RNA-Seq gene expression Negative binomial • 2.0k views

I would probably start with the papers and source code of the established tools that model RNA-seq counts as negative binomial, such as DESeq2 and edgeR, to get an impression of how and why they do it.

4.7 years ago by ATpoint 82k
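
One reason a negative binomial is commonly preferred over a Poisson for count data is overdispersion: a Poisson forces the variance to equal the mean, while a negative binomial allows the variance to exceed it. The sketch below is only an illustration of that check on simulated counts (the vector `x`, the seed, and the simulation parameters are assumptions, not data from this thread), and it is not how DESeq2 or edgeR estimate dispersion.

```r
set.seed(1)
# Simulated counts for one hypothetical gene across 200 samples (not real data):
# negative binomial with mean 10 and size (inverse dispersion) 2.
x <- rnbinom(200, mu = 10, size = 2)

m <- mean(x)
v <- var(x)

# Under a Poisson model the variance equals the mean, so this ratio should be near 1.
dispersion_ratio <- v / m

# Method-of-moments estimate of the NB size parameter, from var = mu + mu^2 / size.
# Only meaningful when the variance exceeds the mean.
size_mom <- if (v > m) m^2 / (v - m) else Inf

cat("mean:", m, " variance:", v, " ratio:", dispersion_ratio,
    " size (MoM):", size_mom, "\n")
```

A ratio well above 1 suggests that a Poisson model would underestimate the variability in the counts.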

As pointed out, check previous work to understand why people chose a particular distribution for a given data type. Typically, when deciding which distribution best approximates the data, people use visual tools (e.g. density and QQ plots) and goodness-of-fit tests (e.g. the chi-squared test). For choosing between (families of) distributions, have a look at the R package fitdistrplus and its descdist() function.

4.7 years ago by Jean-Karim Heriche 27k
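
As a rough sketch of how the parameter estimation and model comparison asked about in the question might look, the example below fits both distributions by maximum likelihood with `fitdistr()` from the MASS package and compares them by AIC. It uses MASS rather than fitdistrplus only because MASS ships with standard R installations; the simulated vector `x` and its parameters are assumptions standing in for real counts.

```r
library(MASS)  # provides fitdistr()

set.seed(42)
# Simulated overdispersed counts standing in for the real data (an assumption).
x <- rnbinom(500, mu = 8, size = 1.5)

# Maximum-likelihood fits of both candidate distributions.
fit.pois <- fitdistr(x, "Poisson")
fit.nb   <- fitdistr(x, "negative binomial")

fit.pois$estimate  # lambda (equals mean(x))
fit.nb$estimate    # size and mu

# Compare the two fits by AIC: the lower value indicates the preferred model.
aic <- c(poisson = AIC(fit.pois), negbin = AIC(fit.nb))
print(aic)
best <- names(which.min(aic))
```

With fitdistrplus installed, `fitdist(x, "pois")` and `fitdist(x, "nbinom")` should give comparable fits, and `descdist(x, discrete = TRUE)` gives a visual summary for discrete data.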
