Rank ChIP-Seq Peaks Based On Motif Occurrence?
8.1 years ago

Hi everybody,

I'm posting today after a long time searching without finding a proper answer to my question.

The workflow for analyzing TF ChIP-seq data is becoming more or less clear:

ChIP --> Library preparation --> Sequencing --> Pre-filtering and QC check --> Mapping to reference genome --> Find enrichment signals (peaks) --> Motif discovery to cross-validate whether the top-ranked peaks contain the TF motif

But I'm interested in ranking all my peaks by some indicator of whether or not the motif of the desired TF is present, and in subsetting the peak list to the real TF binding sites. I have tried MEME/MEME-ChIP/TOMTOM/MAST and also HOMER, but I'm getting confused, because I'm not sure these suites can give me what I want (or maybe I misunderstood something).

So, my question is: is there any way to rank the peaks according to the presence or absence of the TF motif, so that it can be added to the peak-calling statistics?

Thanks for your help and suggestions!

chip-seq motif peak-calling • 3.6k views
8.1 years ago

You can run MEME or MEME-ChIP on the top 600 sequences (ranked by ChIP-seq peak score) and then run MAST with the resulting motif on the whole set of peaks. That gives you the list of potential sites for your motif in the sequences.
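The first step, picking the top-scoring peaks to feed into motif discovery, can be sketched in a few lines of Python. This is only an illustration under assumptions: it assumes a tab-separated BED-like peak file with the score in the fifth column (your peak caller may put it elsewhere), and the names `parse_bed_line` and `top_peaks` are made up for this example. Extracting the sequences and running MEME and MAST are left out.

```python
from typing import List, Tuple

# (chrom, start, end, name, score); assumes a BED5-like layout
Peak = Tuple[str, int, int, str, float]


def parse_bed_line(line: str) -> Peak:
    """Parse one tab-separated peak line, assuming the score is in column 5."""
    fields = line.rstrip("\n").split("\t")
    return (fields[0], int(fields[1]), int(fields[2]), fields[3], float(fields[4]))


def top_peaks(peaks: List[Peak], n: int = 600) -> List[Peak]:
    """Return the n highest-scoring peaks, best first."""
    return sorted(peaks, key=lambda p: p[4], reverse=True)[:n]


# Toy input standing in for a real peak file (hypothetical values)
demo_lines = [
    "chr1\t100\t200\tpeak_a\t12.0",
    "chr1\t500\t650\tpeak_b\t87.5",
    "chr2\t300\t420\tpeak_c\t40.1",
]
demo_peaks = [parse_bed_line(line) for line in demo_lines]
best_two = top_peaks(demo_peaks, n=2)
```

The sequences under the selected peaks would then go to MEME or MEME-ChIP, and the motif found there would be scanned against all peaks with MAST.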

Alternatively, you could use RSAT (http://rsat.ulb.ac.be/rsat/) and its peak-motifs analysis, which looks for over-represented motifs directly in the whole set of peaks and reports the positions of the motif instances in the sequences, along with other information.


I like your solution using MAST. Thanks for your suggestion, I will try it.

8.1 years ago

This can be a complicated question, depending on the particular TF you are looking at. Many TFs, even those with strong motifs, have been shown to bind segments that do not contain the motif. That being said, once you have a motif that you are interested in, there are a few programs that will find all occurrences of that motif genome-wide. Cladimo does this, and I believe that FIMO does as well. Then you can simply intersect the motif occurrences with your peaks. Another thing to keep in mind is that, depending on the size of your motif, you may have multiple motif occurrences within a peak.
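As a rough sketch of that intersection step, the Python below counts the motif occurrences that fall entirely inside each peak and then ranks the peaks: motif-containing peaks first, then by best motif score, then by peak score. The tuple layouts, the function names `annotate_peaks` and `rank_peaks`, and the ranking order itself are assumptions made for illustration; the motif hits would have to be converted from whatever format your scanner writes.

```python
from collections import defaultdict


def annotate_peaks(peaks, motif_hits):
    """Attach motif counts to peaks.

    peaks:      list of (chrom, start, end, name, peak_score)
    motif_hits: list of (chrom, start, end, motif_score)
    Both layouts are assumed for this sketch.
    """
    hits_by_chrom = defaultdict(list)
    for chrom, start, end, score in motif_hits:
        hits_by_chrom[chrom].append((start, end, score))

    annotated = []
    for chrom, start, end, name, peak_score in peaks:
        # Keep only hits lying completely within the peak boundaries
        inside = [
            s for (h_start, h_end, s) in hits_by_chrom.get(chrom, [])
            if h_start >= start and h_end <= end
        ]
        annotated.append({
            "name": name,
            "peak_score": peak_score,
            "n_motifs": len(inside),  # a peak can hold several occurrences
            "best_motif_score": max(inside) if inside else None,
        })
    return annotated


def rank_peaks(annotated):
    """Motif-containing peaks first, then best motif score, then peak score."""
    def key(rec):
        best = rec["best_motif_score"]
        return (rec["n_motifs"] > 0,
                best if best is not None else float("-inf"),
                rec["peak_score"])
    return sorted(annotated, key=key, reverse=True)


# Toy data (hypothetical values)
peaks = [("chr1", 100, 200, "p1", 50.0),
         ("chr1", 300, 400, "p2", 80.0),
         ("chr2", 100, 200, "p3", 30.0)]
hits = [("chr1", 120, 130, 12.5), ("chr1", 150, 160, 9.0),
        ("chr2", 150, 160, 15.0), ("chr1", 395, 405, 20.0)]
ranked = rank_peaks(annotate_peaks(peaks, hits))
```

Because a strong peak can lack the motif (p2 above), it is probably safer to keep the motif count as an extra column next to the peak-calling statistics than to discard motif-less peaks outright. For genome-scale data an interval tool such as bedtools will be much faster than this nested loop.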

Having a motif in your peak is reassuring, but if you are trying to figure out whether your peaks are true signal or noise, you probably want to look at as many sources of information as you can. Overlap with DHS (DNase I hypersensitive sites) or with peaks for other co-factors works well for this, if such data are available.


Interesting point to consider!
