Question: Distribution of somatic mutation
2.7 years ago by
United States
CY wrote:

So... I got two questions.

1) Do somatic mutations follow a Poisson distribution, as germline mutations do?

2) We currently use a Bayesian model to detect germline mutations. Do the mutations have to follow a specific Poisson distribution for a Bayesian model to apply? If not, can we use a Bayesian model for somatic mutation calling? Say we sequence 10,000 tumors and use the mutation frequency at a specific position as the prior probability for that position.

Can someone share some ideas on this? I'd really appreciate it!

snp next-gen • 883 views
modified 2.7 years ago by solo7773 • written 2.7 years ago by CY
2.7 years ago by
solo7773 wrote:

Though I may not know the right answer for you, I'd like to post my opinion as an answer.

1) According to my analysis of all somatic mutations in the latest ICGC data release (landscape of somatic mutations; raw data provided as a supplement), the Poisson distribution does not apply to somatic mutations. Judging by the number of mutations per sample, the counts are more likely to follow a normal or Weibull distribution, although it is hard to say exactly which distribution fits.

Let's have a look at all somatic mutations in the ICGC WGS data (number of mutations per sample):

    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
       0    1506    3668   12450    8816  722400
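A rough sanity check on the Poisson question can be made from the summary alone. The sketch below assumes, hypothetically, that every sample shares one Poisson rate equal to the reported mean, and uses the normal approximation (IQR of about 1.349 standard deviations) to see how wide the middle half of the data would then be. It says nothing about whether mutations within a single tumor arise by a Poisson process.

```python
import math

# Summary statistics quoted above (somatic mutations per WGS sample).
q1, median, mean, q3, maximum = 1506, 3668, 12450, 8816, 722400

# For a Poisson distribution the variance equals the mean.
poisson_sd = math.sqrt(mean)

# For a large mean, a Poisson is close to a normal distribution,
# whose interquartile range is about 1.349 standard deviations.
expected_iqr = 1.349 * poisson_sd
observed_iqr = q3 - q1

print(f"Poisson SD at mean {mean}: {poisson_sd:.1f}")
print(f"IQR expected under a single Poisson rate: {expected_iqr:.0f}")
print(f"IQR observed: {observed_iqr}")
print(f"Maximum lies {(maximum - mean) / poisson_sd:.0f} Poisson SDs above the mean")
```

The observed spread is far wider than a single-rate Poisson allows, and the mean sits well above the median, so the per-sample counts are strongly overdispersed and right-skewed.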

Take the data between the 1st and 3rd quartiles to find which distribution fits. The subset looks like this (the x axis is the sequential order of individuals):

[Figure: distribution across individuals]
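For readers who want to try a comparison like this themselves, here is a minimal sketch of the subset-and-fit step. It is not the analysis from this answer: the counts are synthetic (drawn from a Weibull with made-up parameters), `fit_weibull` is a hypothetical helper that solves the Weibull maximum-likelihood equation by bisection, and the two candidates are compared by log-likelihood only.

```python
import math
import random
from statistics import fmean, pstdev, quantiles

def fit_weibull(xs, lo=0.05, hi=50.0, iters=60):
    """Maximum-likelihood Weibull fit for positive data; returns (shape, scale)."""
    m = fmean(xs)
    zs = [x / m for x in xs]                  # rescale so z**k stays moderate
    mean_log = sum(math.log(z) for z in zs) / len(zs)

    def score(k):
        # The MLE of the shape is the zero of this function, which increases with k.
        num = sum(z**k * math.log(z) for z in zs)
        den = sum(z**k for z in zs)
        return num / den - 1.0 / k - mean_log

    for _ in range(iters):                    # plain bisection on the shape
        mid = (lo + hi) / 2
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = (lo + hi) / 2
    scale = m * (sum(z**k for z in zs) / len(zs)) ** (1 / k)
    return k, scale

def weibull_loglik(xs, k, lam):
    return sum(math.log(k / lam) + (k - 1) * math.log(x / lam) - (x / lam) ** k
               for x in xs)

def normal_loglik(xs, mu, sigma):
    return sum(-0.5 * math.log(2 * math.pi * sigma**2)
               - (x - mu) ** 2 / (2 * sigma**2) for x in xs)

# Synthetic stand-in for per-sample mutation counts (NOT the ICGC data).
random.seed(0)
counts = [random.weibullvariate(5000, 1.5) for _ in range(2000)]

# Keep only the values between the 1st and 3rd quartiles, as in the post.
q1, _, q3 = quantiles(counts, n=4)
subset = [x for x in counts if q1 <= x <= q3]

mu, sigma = fmean(subset), pstdev(subset)     # normal MLE
k, lam = fit_weibull(subset)                  # Weibull MLE

print(f"normal : mu={mu:.0f} sigma={sigma:.0f} loglik={normal_loglik(subset, mu, sigma):.1f}")
print(f"weibull: shape={k:.2f} scale={lam:.0f} loglik={weibull_loglik(subset, k, lam):.1f}")
```

Both candidates have two parameters, so the higher log-likelihood is the better fit on this criterion. Bear in mind that a subset cut at the quartiles is truncated, so parameters fitted to it do not describe the full data set.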

After fitting the distributions, the evaluation is as follows:


2) As I understand Bayesian inference, your data do not need to follow a specific distribution. Bayes' rule combines a prior with the likelihood of the data to give the posterior, so you do need to specify a prior. That said, there are already many excellent tools for somatic variant calling. Unless you are developing a novel method, I'd encourage you to use the existing tools; they are reliable and widely used in the community. Note also that somatic mutations come in different types, e.g. substitutions, indels and structural variations. Are you going to detect all of them? It sounds like you plan to call somatic mutations from statistical information alone. My understanding is that current tools work by aligning the tumor sequences to the reference genome.
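To make the prior idea from the question concrete, here is a toy sketch of Bayes' rule at a single position. It is not how any existing caller works. The function name `posterior_mutated`, the binomial read model, and the default variant allele fraction and error rate are all assumptions chosen for illustration. Note that no Poisson assumption appears anywhere; the only distribution involved is the binomial likelihood of the read counts.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def posterior_mutated(alt_reads, depth, prior, vaf=0.3, error_rate=0.001):
    """Posterior probability that a site carries a somatic mutation.

    prior      -- position-specific prior, e.g. the fraction of previously
                  sequenced tumors that are mutated at this site
    vaf        -- assumed variant allele fraction if mutated (hypothetical)
    error_rate -- assumed per-read sequencing error rate (hypothetical)
    """
    like_mut = binom_pmf(alt_reads, depth, vaf)         # reads if mutated
    like_ref = binom_pmf(alt_reads, depth, error_rate)  # reads if errors only
    evidence = prior * like_mut + (1 - prior) * like_ref
    return prior * like_mut / evidence

# Same evidence (3 alternate reads out of 40), two different priors:
# a recurrent site (mutated in 50 of 10,000 tumors) and a rare one (1 of 10,000).
hotspot = posterior_mutated(3, 40, prior=50 / 10000)
rare = posterior_mutated(3, 40, prior=1 / 10000)
print(f"posterior at recurrent site: {hotspot:.3f}")
print(f"posterior at rare site:      {rare:.4f}")
```

With identical read evidence, the recurrent site ends up with the higher posterior, which is exactly the effect a frequency-based prior is meant to have.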

modified 2.7 years ago • written 2.7 years ago by solo7773