Question: False Positive Peaks In Chip-Seq - How To Locate Them And How Do They Arise?
7.0 years ago by
Tky • 1000 wrote:

I am reading a paper in which ChIP-Seq was used to identify the binding sites of a target TF in mice. Before de novo motif discovery, the authors filtered out peaks located in false-positive regions. (The authors claim that these false-positive peaks arose from low-complexity sequence.)

Since this study included ChIP-Seq on mice with the target TF knocked out, I guess the authors called peaks in the knock-out strain against the input control, and all peaks from the KO mice should be false positives. It also seems these regions are quite consistent across other ChIP-Seq experiments.

Thus I am wondering: in a general ChIP-Seq experiment without a knock-out of the target TF, how should the false-positive peaks be removed? And how do these false-positive peaks arise? Is it because the low-complexity region has multiple copies in the genome? But if that is the case, the false-positive peaks could be largely eliminated by using control non-ChIP DNA.

Thank you very much in advance for sharing your ideas.

chip-seq • 3.5k views
modified 7.0 years ago by Manu Prestat • 4.0k • written 7.0 years ago by Tky • 1000
7.0 years ago by
Chris Whelan • 550 • Portland, OR wrote:

You might want to check out this paper by Pickrell et al., "False positive peaks in ChIP-seq and other sequencing-based functional assays caused by unannotated high copy number regions".

They identified regions that are consistently called as peaks because they consistently receive high coverage, which they think is due to misassembled regions in the reference. They provide a set of BED files you can use to filter out peaks in those regions.
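
To make the "consistently high coverage" idea concrete, here is a minimal sketch. It is not the procedure from the paper, just an illustration: it assumes you already have per-bin read counts from several control (input) samples, and it flags bins whose count exceeds a multiple of the sample's median in most samples. The function name, the `fold` cutoff and the `min_fraction` cutoff are all hypothetical choices.

```python
from statistics import median


def flag_high_coverage_bins(samples, fold=5.0, min_fraction=0.8):
    """Return indices of bins that look like coverage outliers in most samples.

    samples: list of per-bin read-count lists, one list per control sample,
             all binned identically (same length, same bin order).
    fold, min_fraction: illustrative cutoffs, not values from any paper.
    """
    n_bins = len(samples[0])
    hits = [0] * n_bins
    for counts in samples:
        if len(counts) != n_bins:
            raise ValueError("all samples must have the same number of bins")
        # Use the median of non-empty bins as a crude per-sample baseline.
        nonzero = [c for c in counts if c > 0]
        if not nonzero:
            continue
        baseline = median(nonzero)
        for i, c in enumerate(counts):
            if c > fold * baseline:
                hits[i] += 1
    needed = min_fraction * len(samples)
    return [i for i, h in enumerate(hits) if h >= needed]


# Toy example: three control samples, five bins each.
controls = [
    [10, 12, 9, 200, 11],
    [8, 10, 9, 150, 60],
    [11, 9, 10, 180, 10],
]
suspect_bins = flag_high_coverage_bins(controls)
```

In the toy data only the fourth bin is extreme in every sample, so it is the one that gets flagged; the fifth bin is high in a single sample and is left alone.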

In my experience, using a non-ChIP control usually gets rid of those peaks, but not always, so it's a good idea to check your peaks against those regions.
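
Checking peaks against a set of problem regions can be sketched in plain Python as below. This assumes both the peaks and the regions are in BED-style coordinates (0-based, half-open) and uses only the standard library; the function names are made up for illustration, and in practice a dedicated interval tool would do the same job.

```python
import bisect
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines into (chrom, start, end) tuples, skipping headers."""
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        intervals.append((fields[0], int(fields[1]), int(fields[2])))
    return intervals


def build_index(regions):
    """Per chromosome, sort and merge regions into disjoint intervals."""
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        merged = [list(ivs[0])]
        for s, e in ivs[1:]:
            if s <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], e)
            else:
                merged.append([s, e])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps(index, chrom, start, end):
    """True if [start, end) overlaps any indexed region on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Regions are disjoint and sorted, so only the last region that
    # starts before `end` can overlap the query.
    i = bisect.bisect_left(starts, end) - 1
    return i >= 0 and ends[i] > start


def filter_peaks(peaks, regions):
    """Split peaks into (kept, removed) by overlap with the regions."""
    index = build_index(regions)
    kept, removed = [], []
    for peak in peaks:
        chrom, start, end = peak[:3]
        (removed if overlaps(index, chrom, start, end) else kept).append(peak)
    return kept, removed


# Toy example with made-up coordinates.
peaks = parse_bed(["chr1\t100\t200", "chr1\t500\t600", "chr2\t50\t80"])
problem_regions = parse_bed(["chr1\t150\t160", "chr1\t600\t700"])
kept, removed = filter_peaks(peaks, problem_regions)
```

Because the coordinates are half-open, the peak ending at 600 does not overlap the region starting at 600, so only the first peak is removed here.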

written 7.0 years ago by Chris Whelan • 550

Hi, Chris, sorry for my late response. Thanks for pointing out the paper; it is very helpful for understanding the issue. However, I am looking at ChIP-seq data from mice. Are you aware of similar resources for the mouse genome?

written 7.0 years ago by Tky • 1000

Sorry, I haven't seen anything similar for mice. Perhaps you could apply an approach similar to the one the authors used in that paper to publicly available mouse data? You could also screen for the presence of satellite or simple repeats, since those are probably the most likely to be present in those misassembled regions.
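
As a rough idea of what a simple-repeat screen could look like, the sketch below scores a peak sequence by the Shannon entropy of its 3-mers: homopolymers and short tandem repeats score close to zero, while more complex sequence scores higher. This is only a crude stand-in for proper repeat annotation, and the 3.0-bit cutoff is an arbitrary assumption that would need tuning on real data.

```python
import math
from collections import Counter


def kmer_entropy(seq, k=3):
    """Shannon entropy (in bits) of the k-mer composition of seq.

    k-mers containing anything other than A/C/G/T (e.g. N) are ignored.
    """
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    kmers = [km for km in kmers if set(km) <= set("ACGT")]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def is_low_complexity(seq, k=3, min_bits=3.0):
    """Flag a sequence whose k-mer entropy is below min_bits.

    min_bits is a hypothetical cutoff chosen for illustration only.
    """
    return kmer_entropy(seq, k) < min_bits


# Toy sequences: a homopolymer, a dinucleotide repeat, and a mixed sequence.
homopolymer = "A" * 50
dinuc_repeat = "AC" * 25
mixed = "ACGTTGCAAGCTTAGGCATCGATCCGTAAGCTAGCTTACGGATCCATGCA"
flags = [is_low_complexity(s) for s in (homopolymer, dinuc_repeat, mixed)]
```

Peaks flagged this way are candidates for a closer look rather than automatic removal, since some genuine binding sites fall in repetitive sequence.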

Anyway, as you say, using an input non-ChIP control gets rid of most of these types of false positives.

written 7.0 years ago by Chris Whelan • 550