5.2 years ago by
mangfu100720
Korea, Republic Of
mangfu100720 wrote:

Hi all.

I have heard that most CNV detection tools are based on read depth and make a Gaussian assumption about the distribution of the read count ratio.

From this point of view, what do the x-axis and y-axis of the Gaussian distribution represent? My guess is that the y-axis is the read-depth count and the x-axis is the position within an exon. Is that right? If so, does each exon in exome sequencing follow a Gaussian distribution?

With that picture in mind, I cannot connect it to the passage below. Could you take a look and advise?

Most of the existing tools for CNV calling that are based on read depth, such as ExomeCNV and CNV-seq, make Gaussian assumptions about the distribution of read count ratio. In the absence of technical variability, the proportion of reads matching to a specific sample should follow a binomial distribution whose success rate is determined by genome-wide read count ratio between the test sample and reference set.

I understood the passage above to mean that a single sequenced sample follows a Gaussian distribution, while the sample-to-sample comparison follows a binomial distribution.

Is my understanding correct?

sequencing alignment next-gen • 1.8k views
modified 5.1 years ago by Chris Miller21k • written 5.2 years ago by mangfu100720

Say we have a binomial distribution B(n,p). If n is large and both np and n(1-p) are not small, it can be approximated by a Gaussian distribution N(np, np(1-p)). See the Wikipedia page on the binomial distribution.
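
To make that condition concrete, here is a minimal sketch (standard library only) that compares the exact binomial probabilities with the N(np, np(1-p)) density at every count. The values n = 200, p = 0.5 and p = 0.01 are arbitrary illustrations chosen for this sketch, not numbers from any CNV tool.

```python
import math

def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mean, var):
    """Density of a Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def max_abs_gap(n, p):
    """Largest pointwise difference between B(n, p) and N(np, np(1-p))."""
    mean, var = n * p, n * p * (1 - p)
    return max(
        abs(binomial_pmf(k, n, p) - normal_pdf(k, mean, var))
        for k in range(n + 1)
    )

# Illustrative values only (assumed for this sketch).
gap_large_np = max_abs_gap(200, 0.5)   # np = 100, n(1-p) = 100
gap_small_np = max_abs_gap(200, 0.01)  # np = 2, approximation is poor

print(f"np = 100: max gap {gap_large_np:.5f}")
print(f"np = 2:   max gap {gap_small_np:.5f}")
```

The gap is far larger when np is small, which is why the Gaussian approximation is questionable for regions with very few reads.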

5.1 years ago by
Chris Miller21k
Washington University in St. Louis, MO
Chris Miller21k wrote:

Choosing reads is actually a Poisson process, which, due to various technical and alignment biases, can be better represented with a negative binomial distribution (essentially an overdispersed Poisson). For a brief description of this and how it relates to read depth and CNV, see our writeup in this paper: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0016327#s2
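
As a rough illustration of what "overdispersed Poisson" means, the sketch below simulates read counts for one region under both models and compares the variance-to-mean ratio. It is not taken from the linked paper: the mean depth of 50, the dispersion value of 10, and the helper names are all assumptions made for the example. The negative binomial is drawn as a gamma-Poisson mixture, which is one standard way to construct it.

```python
import math
import random
import statistics

def sample_poisson(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_negative_binomial(rng, mean, dispersion):
    """Draw one count from a gamma-Poisson mixture.

    The Poisson rate is itself gamma distributed (shape = dispersion,
    scale = mean / dispersion), giving variance mean + mean**2 / dispersion.
    """
    rate = rng.gammavariate(dispersion, mean / dispersion)
    return sample_poisson(rng, rate)

rng = random.Random(0)       # fixed seed so the sketch is repeatable
mean_depth = 50              # hypothetical mean read count per region
dispersion = 10              # hypothetical; smaller means more overdispersion

poisson_counts = [sample_poisson(rng, mean_depth) for _ in range(5000)]
nb_counts = [sample_negative_binomial(rng, mean_depth, dispersion)
             for _ in range(5000)]

poisson_ratio = statistics.variance(poisson_counts) / statistics.mean(poisson_counts)
nb_ratio = statistics.variance(nb_counts) / statistics.mean(nb_counts)

print(f"Poisson variance/mean:           {poisson_ratio:.2f}")
print(f"Negative binomial variance/mean: {nb_ratio:.2f}")
```

For a pure Poisson process the ratio stays near 1, while the mixture gives a ratio well above 1 at the same mean depth. That extra spread is what the technical and alignment biases add in practice.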