Question: Where are the actual transcription factor binding sites located in ChIP-seq peaks?
2.6 years ago by
Saad Khan310
United States
Saad Khan310 wrote:

Peak callers such as MACS2 and peakzilla report peaks for a given transcription factor ChIP-seq experiment that range in width from about 50 bp up to 1000 bp or more. The average transcription factor binding site (TFBS) is only about 10 bp long, based on the data available in ORegAnno and similar resources (as indicated by this post: https://www.biostars.org/p/64854/). Is it safe to assume that the TFBS lies at the center of a given peak? If not, what would be a good way to locate it, and does a method for this already exist?

modified 2.6 years ago by jotan1.2k • written 2.6 years ago by Saad Khan310
2.6 years ago by
Sean Davis24k
National Institutes of Health, Bethesda, MD
Sean Davis24k wrote:

No, it would not be safe to assume that the binding site is at the center of the peak. If you know the motif for your protein, you can simply search for that motif in your ChIP-seq peaks. If you do not know the motif, you can use one of the many software packages that look for sequences enriched across your peaks to define the motif. In both cases, once you have the motif, you "know" the location.

written 2.6 years ago by Sean Davis24k
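
For the "known motif" case, searching a peak sequence can be sketched in a few lines of Python. This is only a minimal illustration, assuming the motif is given as an IUPAC consensus string and the peak sequence has already been extracted from the genome; the function names and the GATA example are hypothetical. A position weight matrix scanner is usually a better choice than an exact consensus match for real data.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a sequence written in uppercase IUPAC codes."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif_sites(peak_seq, consensus):
    """Return sorted (start, end, strand) hits, 0-based half-open, relative to the peak."""
    peak_seq = peak_seq.upper()
    consensus = consensus.upper()
    hits = set()
    for strand, motif in (("+", consensus), ("-", reverse_complement(consensus))):
        # A lookahead is used so that overlapping occurrences are all reported.
        pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in motif) + "))")
        for m in pattern.finditer(peak_seq):
            hits.add((m.start(), m.start() + len(motif), strand))
    return sorted(hits)


# Hypothetical example: scan a short sequence for a GATA-like consensus.
print(find_motif_sites("AAGATAAGG", "WGATAR"))
```

Adding each hit's start to the peak's genomic start gives the genomic coordinates of the candidate site. Palindromic motifs are reported once per strand at the same position.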
2.6 years ago by
Alex Reynolds24k
Seattle, WA USA
Alex Reynolds24k wrote:

You could use FIMO in the MEME suite to scan for motif models (JASPAR, etc.) across your genome of interest, at a level of statistical significance that you deem acceptable (e.g., 1e-4, 1e-5).

Take the search result from FIMO and convert it to a UCSC BED-formatted file. This file contains the putative binding sites of all motifs from the motif models across the whole genome.
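
The conversion step might look roughly like the Python sketch below. It assumes the tab-separated FIMO output of recent MEME Suite releases, with a header naming the columns motif_id, sequence_name, start, stop, strand and score, and 1-based inclusive coordinates; older releases use a different header, so check your own file first. The function name is hypothetical, and sequence_name is assumed to be the chromosome name.

```python
import csv


def fimo_tsv_to_bed(lines):
    """Yield BED6 tuples (chrom, start, end, name, score, strand) from FIMO TSV lines."""
    # Skip blank lines and '#' comment lines, which can follow the data rows.
    data = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    for row in csv.DictReader(data, delimiter="\t"):
        start = int(row["start"]) - 1  # 1-based inclusive -> 0-based start
        end = int(row["stop"])         # inclusive stop equals the half-open end
        yield (row["sequence_name"], start, end,
               row["motif_id"], row["score"], row["strand"])


# Hypothetical two-line input: a header and one hit for a motif called "motifA".
example = [
    "motif_id\tmotif_alt_id\tsequence_name\tstart\tstop\tstrand\tscore\tp-value\tq-value\tmatched_sequence",
    "motifA\t\tchr1\t101\t111\t+\t12.5\t1e-05\t0.2\tTTCTTATCTGT",
]
for bed_row in fimo_tsv_to_bed(example):
    print("\t".join(str(field) for field in bed_row))
```

The resulting file should be sorted (for example with BEDOPS sort-bed) before it is used in set operations, since bedmap expects sorted input.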

Then do set operations with BEDOPS tools (like bedmap) to precisely locate putative TF binding sites that overlap with — are contained entirely or partially within — your ChIP-seq peaks.

$ bedmap --echo --echo-map peaks.bed wg-motifs.bed > answer.bed

If you repeat your experiments with other motifs, you can reuse your whole-genome search result to apply the same set operations.

modified 2.6 years ago • written 2.6 years ago by Alex Reynolds24k

What if a peak contains more than one enriched motif because the peak region is large?

written 2.6 years ago by Saad Khan310

The bedmap result will show all overlapping motifs per peak.

written 2.6 years ago by Alex Reynolds24k
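
When a peak overlaps several motif hits, one way to inspect them is to list each hit with its offset from the peak midpoint, which also speaks to the original question of how often the site sits at the center. The Python sketch below is a plain, unoptimized stand-in for the overlap step, not a reimplementation of bedmap; the function name and the tuple layouts are assumptions made for illustration. If your peak caller reports a summit (MACS2 does in its narrowPeak output), that position may be a better reference point than the midpoint.

```python
def motifs_by_peak(peaks, motifs):
    """Map each peak to its overlapping motif hits, nearest to the peak midpoint first.

    peaks:  list of (chrom, start, end)
    motifs: list of (chrom, start, end, name)
    Coordinates are 0-based half-open, as in BED.
    """
    result = {}
    for p_chrom, p_start, p_end in peaks:
        center = (p_start + p_end) / 2
        hits = []
        for m_chrom, m_start, m_end, name in motifs:
            # Two half-open intervals overlap when each starts before the other ends.
            if m_chrom == p_chrom and m_start < p_end and p_start < m_end:
                offset = (m_start + m_end) / 2 - center
                hits.append((name, m_start, m_end, offset))
        hits.sort(key=lambda h: abs(h[3]))
        result[(p_chrom, p_start, p_end)] = hits
    return result


# Hypothetical coordinates: one 200 bp peak and three motif hits on chr1.
peaks = [("chr1", 100, 300)]
motifs = [("chr1", 110, 120, "m1"), ("chr1", 195, 205, "m2"), ("chr1", 295, 305, "m3")]
for peak, hits in motifs_by_peak(peaks, motifs).items():
    print(peak, hits)
```

The nested loop is quadratic, so this is only meant for small examples; for genome-wide data the sorted-input tools are far more efficient.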
2.6 years ago by
jotan1.2k
Australia
jotan1.2k wrote:

This is a lab-based rather than a bioinformatics-based method, but have you considered ChIP-exo? http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3813302/

I believe that if you have a very high-density standard ChIP-seq track, it's possible to mine this data to get the same type of information. In a deeply sequenced ChIP-seq track, it's sometimes possible to identify regions with truncated reads where the termination points mark out the TF binding region. This is also dependent on very good sonication of the samples.

written 2.6 years ago by jotan1.2k
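
The idea of using read termination points can be illustrated with a rough Python sketch: within one peak, find the most common 5' end position on each strand and take the midpoint between them as a crude estimate of the binding position. This is an assumption-laden simplification, not the method of any particular ChIP-exo tool; the function name and input layout are hypothetical, and real data would need deduplication, smoothing and a background model.

```python
from collections import Counter


def estimate_site_from_read_ends(reads):
    """Estimate a binding position from strand-specific 5' read ends.

    reads: iterable of (start, end, strand), 0-based half-open.
    Returns (plus_mode, minus_mode, midpoint), or None if a strand has no reads.
    """
    plus, minus = Counter(), Counter()
    for start, end, strand in reads:
        if strand == "+":
            plus[start] += 1       # 5' end of a forward read is its first base
        else:
            minus[end - 1] += 1    # 5' end of a reverse read is its last base
    if not plus or not minus:
        return None
    # Most frequent position per strand; ties go to the smaller coordinate.
    plus_mode = max(plus, key=lambda pos: (plus[pos], -pos))
    minus_mode = max(minus, key=lambda pos: (minus[pos], -pos))
    return plus_mode, minus_mode, (plus_mode + minus_mode) / 2


# Hypothetical reads within a single peak.
reads = [(100, 136, "+"), (100, 136, "+"), (98, 134, "+"),
         (110, 146, "-"), (110, 146, "-"), (115, 151, "-")]
print(estimate_site_from_read_ends(reads))
```

In ChIP-exo the two strand-specific pileups are expected to flank the protein tightly; in standard ChIP-seq they are much more diffuse, so the estimate is correspondingly less precise.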