Question: background for motif analysis
3.6 years ago by
mariamari69310 wrote:

I am having trouble choosing background sequences for motif analysis. Say I have DNase I data from 3 different conditions. I clustered the peaks and want to compare motif enrichment among the clusters. Each cluster has a different number of intervals, but for motif analysis I resized them all to the same length (200 bp, based on the peak summits). My questions are: (1) Should I use the total set of DNase I peaks (merged from all 3 conditions) as background, or genomic regions? Which one is better, or does it make any difference? (2) When running the motif search for each cluster, should I use the same set of background sequences, or should I adjust the background for each cluster separately, based on GC% and number of regions?

sequencing chip-seq next-gen • 1.5k views
3.6 years ago by
Boston, MA
ejm32440 wrote:

The short answer is: try all three!

My thoughts on the matter:

  1. If you use the superset of DHS peaks as background, you may not get any hits, since the smaller sets may not show enough enrichment against it.
  2. If the different sets of peaks have wildly different sequence composition, the superset will not be a good background, because it will not accurately capture the sequence composition of each set.
  3. I would let your motif-finding program choose the background sequences for you, and then repeat the analysis with the same set of background sequences.
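If you want to see what composition-matched background selection looks like, here is a minimal sketch in plain Python. It is not what any particular motif finder does internally; the function names (`gc_fraction`, `gc_bin`, `gc_matched_background`), the 10-bin scheme, and the `per_target` parameter are all assumptions made for illustration. It takes the sequences of one cluster as targets, a pool of candidate sequences (for example the merged DHS peaks or random genomic regions, already extracted as strings), and samples candidates so that their GC distribution mirrors the targets.

```python
import random
from collections import defaultdict


def gc_counts(seq):
    """Return (number of G/C bases, number of A/C/G/T bases) in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc, gc + at


def gc_fraction(seq):
    """GC fraction over unambiguous bases; 0.0 if there are none."""
    gc, total = gc_counts(seq)
    return gc / total if total else 0.0


def gc_bin(seq, n_bins=10):
    """Index (0 .. n_bins-1) of the GC bin that seq falls into."""
    gc, total = gc_counts(seq)
    if total == 0:
        return 0
    # Integer arithmetic avoids floating-point trouble at bin edges.
    return min(gc * n_bins // total, n_bins - 1)


def gc_matched_background(targets, candidates, per_target=1, n_bins=10, seed=0):
    """Sample candidates so their GC-bin counts mirror those of targets.

    For each GC bin, up to per_target * (targets in that bin) candidates
    are drawn without replacement. Bins that run short of candidates
    simply contribute fewer sequences.
    """
    rng = random.Random(seed)

    # Group the candidate pool by GC bin.
    pool = defaultdict(list)
    for seq in candidates:
        pool[gc_bin(seq, n_bins)].append(seq)

    # Count how many background sequences each bin should supply.
    wanted = defaultdict(int)
    for seq in targets:
        wanted[gc_bin(seq, n_bins)] += per_target

    background = []
    for b, n in sorted(wanted.items()):
        available = pool.get(b, [])
        background.extend(rng.sample(available, min(n, len(available))))
    return background
```

Calling `gc_matched_background(cluster_seqs, merged_peak_seqs, per_target=5)` once per cluster would give each cluster its own GC-matched background. Comparing those results with a run that uses one shared background for all clusters is one way to check whether the choice changes which motifs come out.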

Good luck!
