Comparing Peaks Coming From Different Peak Calling Programs
11.0 years ago
KCC ★ 4.1k

I have a few concerns when comparing peaks between different programs or between different replicates with the same program:

  1. Peak width: how do I handle the case where one program calls a region as a single peak and another calls the same region as two peaks? This makes raw counts hard to compare. Saying program A called 10,000 peaks but program B called 9,000 means little if the two programs split regions differently.

  2. How do I define two peaks as being the same peak (so I can say two replicates called the same peak for instance)? How much should they overlap? Or should they be called the same if they are within a certain distance of each other? How do I define that distance?

  3. What is the most typical and accepted way to measure how significantly two sets of peaks overlap? I know that I could use GSC, the Bioconductor package Co-occur, or some version of the hypergeometric test. I was hoping to get a sense from the community of how common each of these approaches is. What do most people do?
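To make points 1 and 2 concrete, here is a minimal sketch of one possible convention, not an accepted standard. It assumes BED-style half-open coordinates given as `(chrom, start, end)` tuples. The function names, the `max_gap` parameter, and the 50% overlap threshold are all hypothetical choices made for illustration. The idea is to merge both peak sets into union regions first, so that one wide peak and two narrow peaks over the same region count once, and then to ask which call sets support each union region.

```python
from collections import defaultdict


def merge_intervals(peaks, max_gap=0):
    """Merge peaks on the same chromosome that overlap or lie within max_gap bp."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is None:
                cur_start, cur_end = start, end
            elif start <= cur_end + max_gap:
                # Extend the current region instead of starting a new one.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


def overlap_bp(a, b):
    """Number of bases shared by two half-open intervals (chrom, start, end)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def same_peak(a, b, min_frac=0.5):
    """Assumed rule: the overlap must cover at least min_frac of the shorter peak."""
    shorter = min(a[2] - a[1], b[2] - b[1])
    return shorter > 0 and overlap_bp(a, b) / shorter >= min_frac


def shared_regions(peaks_a, peaks_b, max_gap=0):
    """Count union regions supported by set A, by set B, and by both."""
    union = merge_intervals(peaks_a + peaks_b, max_gap)

    def supported(region, peaks):
        return any(overlap_bp(region, p) > 0 for p in peaks)

    in_a = [supported(r, peaks_a) for r in union]
    in_b = [supported(r, peaks_b) for r in union]
    return {
        "union": len(union),
        "a": sum(in_a),
        "b": sum(in_b),
        "both": sum(x and y for x, y in zip(in_a, in_b)),
    }


# Toy data: program B splits the first region of program A into two peaks.
program_a = [("chr1", 100, 500), ("chr1", 900, 1200), ("chr2", 50, 150)]
program_b = [("chr1", 120, 280), ("chr1", 300, 480), ("chr1", 2000, 2300)]
counts = shared_regions(program_a, program_b)
print(counts)
```

In the toy data, program B reports three peaks but supports only two union regions, which is the kind of discrepancy point 1 describes. The choice of `min_frac` and `max_gap` is exactly the open question in point 2, so the numbers depend on those assumptions.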

peak-calling • 2.7k views
11.0 years ago
Stephen 2.8k

See ENCODE's Irreproducible Discovery Rate (IDR) and the ENCODE papers that describe it. Via @nextgenseek.
