Question: Showing statistically that specific TF binding locations are enriched for mutations relative to the rest of the binding sites
morovatunc (Turkey) wrote, 2.4 years ago:

We are analyzing cancer patient mutation data. We defined a set of regions on the human genome as binding events, and would like to show that some of these regions carry significantly more mutations than others. To do this, we decided to set a mutation count threshold: a region must contain at least X mutations to be called a hotspot.

Parameters:

  1. We assume that the mutation probability is constant within these regions.
  2. We have a total of 196 patients in this project.
  3. We have 4500 binding events (regions of interest).
  4. A total of 960 mutations were found in the proximity of the regions.
  5. 750 bp is the median length of all binding events (for this discussion, let's assume they are all 750 bp).
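As a quick sanity check on these numbers, the expected mutation count per region under the constant-rate assumption can be worked out directly. This is only a sketch: the variable names are made up for illustration, and it assumes all 960 mutations are spread uniformly over 4500 regions of exactly 750 bp.

```python
# Values taken from the parameter list above.
n_patients = 196
n_regions = 4500
n_mutations = 960
region_len_bp = 750  # median length, assumed here for every region

# Expected mutations per region if all mutations are spread uniformly.
expected_per_region = n_mutations / n_regions

# Per-base rate, and per-base-per-patient rate, under the same assumption.
rate_per_bp = n_mutations / (n_regions * region_len_bp)
rate_per_bp_per_patient = rate_per_bp / n_patients

print(f"expected mutations per region: {expected_per_region:.4f}")
print(f"rate per bp (all patients):    {rate_per_bp:.3e}")
print(f"rate per bp per patient:       {rate_per_bp_per_patient:.3e}")
```

The expected count per region comes out well below one, which matters for the choice of distribution discussed in the questions.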

Attempt: To solve this problem, I thought applying the normal approximation to the binomial distribution could be useful.
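To see how well the normal approximation holds with these numbers, one can compare it against the exact binomial tail. The sketch below makes an assumption that is not stated in the post: each (base, patient) pair in a 750 bp region is treated as one Bernoulli trial, so n = 750 × 196 and p = 960 / (4500 × n). The function names are hypothetical and only the standard library is used.

```python
import math

n_trials = 750 * 196                 # assumed: one trial per base per patient
p = 960 / (4500 * n_trials)          # assumed constant mutation probability
mean = n_trials * p                  # expected mutations per region
sd = math.sqrt(n_trials * p * (1 - p))

def binom_tail(k, n, prob, extra_terms=60):
    """P(X >= k) for X ~ Binomial(n, prob), summing pmf terms recursively."""
    pmf = (1 - prob) ** n            # pmf at 0
    ratio = prob / (1 - prob)
    for i in range(k):               # advance pmf from 0 up to k
        pmf *= (n - i) / (i + 1) * ratio
    total = 0.0
    for i in range(k, min(n, k + extra_terms) + 1):
        total += pmf
        pmf *= (n - i) / (i + 1) * ratio
    return total

def normal_tail(k, mu, sigma):
    """Normal approximation to P(X >= k) with continuity correction."""
    z = (k - 0.5 - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

for k in range(1, 7):
    exact = binom_tail(k, n_trials, p)
    approx = normal_tail(k, mean, sd)
    print(f"k={k}: exact binomial {exact:.3e}, normal approx {approx:.3e}")
```

Because n × p is far below the usual rule-of-thumb values for the normal approximation, the two tails diverge sharply for larger k, so an exact binomial (or Poisson) tail is probably the safer choice here.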

Questions:

1. I will test each binding region's mutation count, X, against Uo. As a final step, I will correct for multiple hypothesis testing over the total number of binding regions. Is this correct?
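For the multiple-testing step in question 1, two common corrections are Bonferroni and Benjamini-Hochberg (BH). The sketch below is only an illustration: the p-values are invented, the function names are hypothetical, and in the real analysis the list would hold one p-value per binding region (4500 in total).

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, pv * m) for pv in pvalues]

def benjamini_hochberg(pvalues):
    """BH-adjusted p-values (FDR control), returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Invented per-region p-values, for illustration only.
example_pvalues = [0.01, 0.04, 0.03, 0.005]
print("Bonferroni:", bonferroni(example_pvalues))
print("BH:        ", benjamini_hochberg(example_pvalues))
```

Bonferroni controls the family-wise error rate and is the stricter of the two; BH controls the false discovery rate, which is often the more practical target when thousands of regions are tested.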

Referring to this question: the OP asked something very similar, but focused mainly on a patient-wise analysis, which I don't think is relevant in my case. Therefore:

2. Could implementing a Poisson distribution be more accurate?
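To make question 2 concrete, here is a minimal sketch of a Poisson model for the per-region count. It assumes a rate of λ = 960 / 4500 per region (uniform spread over equal-length regions) and a Bonferroni-corrected significance level of 0.05 / 4500; both are assumptions for illustration, as are the function names.

```python
import math

lam = 960 / 4500          # assumed expected mutations per region
alpha = 0.05 / 4500       # assumed Bonferroni-corrected significance level

def poisson_tail(k, rate, extra_terms=100):
    """P(X >= k) for X ~ Poisson(rate), summing the upper tail directly."""
    pmf = math.exp(-rate)             # pmf at 0
    for i in range(k):                # advance pmf from 0 up to k
        pmf *= rate / (i + 1)
    total = 0.0
    for i in range(k, k + extra_terms + 1):
        total += pmf
        pmf *= rate / (i + 1)
    return total

def hotspot_threshold(rate, level):
    """Smallest count X whose upper-tail probability falls below `level`."""
    k = 1
    while poisson_tail(k, rate) >= level:
        k += 1
    return k

threshold = hotspot_threshold(lam, alpha)
print(f"lambda per region = {lam:.4f}")
print(f"minimum mutation count for a hotspot: {threshold}")
print(f"tail probability at threshold: {poisson_tail(threshold, lam):.3e}")
```

With a very large number of trials and a very small per-trial probability, the Poisson and exact binomial tails are nearly identical, so the Poisson form is mainly a convenience. If regions differ in length, λ would need to be scaled per region rather than fixed.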

I am very confused, so any guidance would be very helpful.

Tags: chip-seq, stat