I am a biologist trying to understand the statistics of RNA-Seq data.
RNA-Seq counts across biological replicates are usually modelled with a negative binomial (NB) distribution, because it accounts for overdispersion, but I am not sure how to verify this for my own data.
Although I have read about these distributions in standard textbooks, I am unable to relate them to RNA-Seq.
Differential expression is my aim.
I have simulated data for testing and understanding open-source tools such as edgeR, DESeq and Cufflinks.
I have a real data set too.
I have two conditions with four replicates each.
To find out whether my data fit an NB or a Poisson distribution, do I have to check this across the replicates of each gene within each condition?
If the above point is right, how do I do it?
Should I do a goodness-of-fit test such as the chi-square test, or is checking the mean-variance relationship enough?
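To make the question concrete, here is a rough sketch of the per-gene check I have in mind, written in Python with NumPy. The counts are made up, the function name `mean_variance_summary` is my own, and I am assuming the counts have already been normalised for library size. Under a Poisson model the variance should be close to the mean, whereas under an NB model I would expect variance = mean + alpha * mean^2 with alpha > 0. I realise the chi-square approximation is probably crude with only four replicates.

```python
import numpy as np

# Made-up counts for ONE condition: rows are genes, columns are 4 replicates.
# Assumed to be already normalised for library size.
counts = np.array([
    [10, 12, 9, 11],      # variance below the mean
    [100, 150, 60, 90],   # variance far above the mean
], dtype=float)


def mean_variance_summary(counts):
    """Per-gene mean, variance and overdispersion summaries across replicates."""
    n = counts.shape[1]
    mean = counts.mean(axis=1)
    var = counts.var(axis=1, ddof=1)  # unbiased sample variance
    with np.errstate(divide="ignore", invalid="ignore"):
        # Poisson index of dispersion: roughly chi-square with n-1 degrees
        # of freedom if the replicates really are Poisson.
        index = np.where(mean > 0, (n - 1) * var / mean, np.nan)
        # Method-of-moments NB dispersion from var = mean + alpha * mean**2,
        # clipped at zero when the variance does not exceed the mean.
        alpha = np.where(mean > 0, np.maximum(var - mean, 0.0) / mean**2, np.nan)
    return mean, var, index, alpha


mean, var, index, alpha = mean_variance_summary(counts)

CHI2_CRIT_DF3 = 7.815  # 95% quantile of chi-square with 3 degrees of freedom
overdispersed = index > CHI2_CRIT_DF3

for g in range(counts.shape[0]):
    print(f"gene {g}: mean={mean[g]:.2f} var={var[g]:.2f} "
          f"index={index[g]:.2f} alpha={alpha[g]:.3f} "
          f"overdispersed={overdispersed[g]}")
```

I would run this separately for each condition, so that real differential expression between conditions is not mistaken for overdispersion.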
Thanks in advance for your valuable inputs.