**330** wrote:

Hi, please check my understanding of how the Poisson distribution is used when finding peaks in ChIP-seq and CLIP-seq. The number of times a base is sequenced is commonly assumed to follow a Poisson distribution, just like the number of people entering a supermarket through a particular entrance in a given period of time. The Poisson distribution can be plotted as follows:

Here the mean is 7 (lambda = 7), marked by the red line. The green line marks the read count at which the cumulative probability reaches 0.975.
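A plot like the one described can be redrawn in R with something like the sketch below. The range of counts (0 to 25) and the axis labels are my own choices, not taken from the original figure; only lambda = 7, the 0.975 level, and the red/green colours come from the description above.

```r
lambda <- 7                     # mean of the Poisson distribution, as in the post
x <- 0:25                       # read counts to display (range chosen for illustration)
prob <- dpois(x, lambda)        # P(X = x) for each count

# Smallest count k such that P(X <= k) >= 0.975
cutoff <- qpois(0.975, lambda)

plot(x, prob, type = "h",
     xlab = "Number of reads covering a base",
     ylab = "Probability")
abline(v = lambda, col = "red")    # mean
abline(v = cutoff, col = "green")  # 0.975 quantile
```

`dpois()` gives the height of each bar and `qpois()` gives the position of the green line.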

On a genome-wide scale, *coverage* = *Read Length (nt)* × *Total Number of Reads* / *Genome Length (nt)*, i.e. the average number of reads that cover a base. So we use the *coverage* as the lambda (mean) of the Poisson distribution, let's say 7 again here. The x-axis is the number of reads at a base, and the y-axis is the probability of that number of reads.
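As a small worked example of the coverage formula in R: the helper name `mean_coverage` and the input numbers (50 nt reads, 42 million reads, a 300 Mb genome) are made up purely so that the result comes out at 7; they are not from any real experiment.

```r
# Hypothetical helper: genome-wide average number of reads covering a base
mean_coverage <- function(read_length, n_reads, genome_length) {
  read_length * n_reads / genome_length
}

# Illustrative inputs only, chosen to give lambda = 7
lambda <- mean_coverage(read_length = 50, n_reads = 42e6, genome_length = 3e8)
print(lambda)
```

The resulting `lambda` is what would be passed on to `dpois()`, `qpois()` or `ppois()`.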

So the 0.975 quantile is at 13 reads, i.e. P(reads <= 13) >= 0.975. If the read count observed at a base in a real ChIP-seq/CLIP-seq experiment is larger than 13, that is unlikely (probability about 0.013), as long as the reads follow a Poisson distribution. However, in a real experiment we might detect, let's say, 20 reads at a position. How do we explain this result? It is because of enrichment induced by the chromatin-binding protein (in ChIP-seq): those reads are not randomly distributed. We can then get a p-value for a read count of 20 by calling **ppois(20-1, lambda=7, lower.tail=FALSE)** in R.
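Written out in R, the tail calculation looks roughly like this. The variable names are mine; the point is that `ppois(q, lambda, lower.tail = FALSE)` returns P(X > q), so `q = 20 - 1` is needed to get P(X >= 20).

```r
lambda <- 7        # expected reads per base under the random (background) model
observed <- 20     # reads actually seen at the position

# Upper-tail p-value: P(X >= 20) = P(X > 19)
p_value <- ppois(observed - 1, lambda, lower.tail = FALSE)

# Same quantity via the complement, as a sanity check
p_check <- 1 - ppois(observed - 1, lambda)

# Probability of seeing more than 13 reads by chance
p_above_cutoff <- ppois(13, lambda, lower.tail = FALSE)

print(c(p_value = p_value, p_check = p_check, p_above_cutoff = p_above_cutoff))
```

Note that the argument has to be spelled `lower.tail`; `low.tail` is not a prefix of it, so R's partial argument matching will not accept it.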

Am I right? I don't want to end up with a wrong understanding.