background for motif analysis
7.5 years ago
mariamari693 ▴ 20

I am having trouble choosing background sequences for motif analysis. I have DNase I data from 3 different conditions. I clustered the peaks and want to compare motif enrichment among the clusters. Each cluster contains a different number of intervals, but for the motif analysis I resized them all to the same length (200 bp around the summits). My questions are:

(1) Should I use the total set of DNase I peaks (merged from all 3 conditions) as background, or genomic regions? Is one better, or does it make no difference?

(2) For the motif search in each cluster, should I use the same set of background sequences, or should I adjust the background for each cluster separately based on GC% and number of regions?

next-gen ChIP-Seq sequencing • 2.5k views
7.5 years ago
ejm32 ▴ 450

The short answer is try all three!!!

My thoughts on the matter:

  1. If you use the superset of DHS peaks as background, you may not get any hits, because motifs in the smaller sets may not be enriched enough relative to it.
  2. If the different sets of peaks have wildly different sequence composition, the superset will not be a good background, because it will not accurately capture the sequence composition of each set.
  3. I would first let your motif-finding program choose the background sequences for you, and then repeat the analysis with the same set of background sequences.
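If you do want to build a composition-matched background by hand for each cluster, here is a minimal sketch of one way to do it. It assumes you already have the 200 bp sequences as plain strings; the function names (`gc_fraction`, `gc_bin`, `sample_gc_matched_background`) and the choice of 10 GC bins are mine, for illustration only, and not taken from any particular motif tool. The candidate pool could be the merged DHS peaks or random genomic regions, ideally with the cluster's own sequences removed first.

```python
import random
from collections import defaultdict


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_bin(seq, n_bins=10):
    """Assign a sequence to one of n_bins equal-width GC bins."""
    return min(int(gc_fraction(seq) * n_bins), n_bins - 1)


def sample_gc_matched_background(foreground, pool, n_bins=10, seed=0):
    """Draw background sequences from pool so that each GC bin holds as many
    sequences as the foreground does (hypothetical helper, not a tool API).

    Sampling is without replacement. If the pool has too few sequences in
    a bin, that bin is filled only as far as the pool allows.
    """
    rng = random.Random(seed)

    # Group candidate background sequences by GC bin.
    pool_by_bin = defaultdict(list)
    for seq in pool:
        pool_by_bin[gc_bin(seq, n_bins)].append(seq)

    # Count how many foreground sequences fall into each GC bin.
    needed = defaultdict(int)
    for seq in foreground:
        needed[gc_bin(seq, n_bins)] += 1

    background = []
    for b, count in needed.items():
        candidates = pool_by_bin.get(b, [])
        k = min(count, len(candidates))
        background.extend(rng.sample(candidates, k))
    return background
```

Running this once per cluster gives each cluster a background of matching size and GC distribution, which you can then compare against the run that uses one shared background.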

Good luck!
