Dear All,

I am a biologist trying to understand the statistics of RNA-Seq data.

I understand that RNA-Seq counts from biological replicates are usually modeled with a negative binomial (NB) distribution, because it accounts for overdispersion in the data, but I am not sure how to verify this for my own data.

Although I have read about these distributions in standard textbooks, I am unable to relate them to RNA-Seq.

My aim is differential expression analysis.

I have simulated data to test and understand some open-source tools such as edgeR, DESeq, and Cufflinks.

I have a real data set too.

I have two conditions with four replicates each.

To find out whether my data fit an NB or a Poisson distribution, do I have to check this across the replicates of each gene within each condition?

If the above point is right, how do I do it?

Should I do a goodness-of-fit test such as the chi-square test, or is checking the mean-variance relationship enough?
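
To make the question concrete, here is a minimal sketch of the per-gene check I have in mind, and I am not sure it is the right approach. It assumes the usual NB parameterization, variance = mean + alpha * mean^2 (alpha = 0 gives Poisson), and uses the index-of-dispersion statistic, which is approximately chi-square with n - 1 degrees of freedom under Poisson. The function name `dispersion_summary` and the counts are made up for illustration; they are not from edgeR, DESeq, or my data.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Summarise one gene's replicate counts within a single condition."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    # Method-of-moments NB dispersion from var = mean + alpha * mean^2,
    # floored at 0 (alpha = 0 corresponds to Poisson).
    alpha = max((v - m) / m**2, 0.0) if m > 0 else 0.0
    # Index-of-dispersion statistic: sum((x - mean)^2) / mean.
    # Under Poisson it is approximately chi-square with n - 1 df.
    chi2_stat = (len(counts) - 1) * v / m if m > 0 else 0.0
    return {"mean": m, "variance": v, "alpha": alpha, "chi2_stat": chi2_stat}


# Made-up counts for two hypothetical genes, four replicates each.
examples = {
    "poisson_like": [98, 105, 101, 96],
    "overdispersed": [40, 160, 75, 125],
}
for name, counts in examples.items():
    print(name, dispersion_summary(counts))
```

With four replicates the statistic would be compared against a chi-square distribution with 3 degrees of freedom (95% critical value about 7.815). I realise that so few replicates give a per-gene test little power, which is part of why I am asking whether looking at the mean-variance trend across all genes is the better check.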

Thanks in advance for your valuable inputs.

