Question: Ideas to split peaks into subpeaks - beyond PeakSplitter
2.8 years ago by Sergio Martínez Cuesta (170), Cambridge, UK, wrote:

Here are some thoughts on how to split the peaks in a bed file into narrower (perhaps better-defined) subpeaks, using signal or read counts from e.g. wig or bedgraph files.

One possible scenario: you have identified peaks (treatment vs. control) using a caller such as macs2. However, after looking at the signal in igv, some of the called peaks appear to contain more than one peak, so you would like to split the resulting peaks bed file further into subpeaks.

    *               *
    **              **
   *****     *     ****
  *******   ***   ****** 

We have had mixed experiences using PeakSplitter.jar for this, so I am putting forward some possible alternatives:

(1) Split the regions in your original peak bed file into sliding windows using bedtools makewindows

Then count reads in the sliding windows using bedtools coverage. Finally, perform differential analysis on the counts (treatment vs. control) obtained from the sliding windows using limma, edgeR or DESeq2.
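
To make the windowing and counting idea in (1) concrete, here is a minimal pure-Python sketch. It only illustrates the logic and is not a replacement for bedtools: the function names, the window size and step, and the toy peak and reads are all assumptions made up for this example, and the tiling is only roughly analogous to what bedtools makewindows produces.

```python
def make_windows(chrom, start, end, size=50, step=25):
    """Tile one peak with overlapping windows, truncating at the peak end."""
    return [(chrom, s, min(s + size, end)) for s in range(start, end, step)]


def count_overlaps(window, reads):
    """Count reads (chrom, start, end) that overlap a half-open window."""
    chrom, w_start, w_end = window
    return sum(1 for c, s, e in reads if c == chrom and s < w_end and e > w_start)


# Hypothetical peak and reads, 0-based half-open coordinates as in bed files.
peak = ("chr1", 1000, 1120)
reads = [("chr1", 990, 1026), ("chr1", 1010, 1046), ("chr1", 1100, 1136)]

windows = make_windows(*peak)
counts = [count_overlaps(w, reads) for w in windows]

for w, n in zip(windows, counts):
    print(*w, n, sep="\t")
```

A table like this, built per sample, is the kind of count matrix that would then go into limma, edgeR or DESeq2.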

(2) Go back to the peak calling step e.g. macs2 callpeak but now using the option --call-summits

macs2 callpeak -h

--call-summits   If set, MACS will use a more sophisticated signal
                 processing approach to find subpeak summits in each
                 enriched peak region. DEFAULT: False

Once you have the summits (1bp each), extend them left and right by e.g. 5-10bp using bedtools slop.
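
As a rough illustration of the extension step in (2), the sketch below pads each 1bp summit on both sides and clips the result to the chromosome bounds. The function name, the chromosome size and the summit positions are hypothetical values chosen for this example; in practice bedtools slop with a genome file does this for you.

```python
def extend_summit(chrom, summit, flank, chrom_sizes):
    """Extend a 0-based summit position by `flank` bp on each side,
    clipped to [0, chromosome length)."""
    start = max(0, summit - flank)
    end = min(chrom_sizes[chrom], summit + 1 + flank)
    return (chrom, start, end)


# Hypothetical chromosome size and summit positions.
chrom_sizes = {"chr1": 2000}
summits = [("chr1", 4), ("chr1", 1500), ("chr1", 1995)]

subpeaks = [extend_summit(c, pos, 10, chrom_sizes) for c, pos in summits]

for interval in subpeaks:
    print(*interval, sep="\t")
```

Summits close to a chromosome end give shorter intervals, which is worth keeping in mind if downstream steps assume a fixed subpeak width.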

(3) Use tools that work directly on the wig or bedgraph signal (restricted to your peak bed regions), identify local maxima, and extend them left and right as in (2). The routine bwtool find might be of use here.
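
A minimal sketch of the local-maxima idea in (3), assuming the per-base signal inside one peak has already been extracted into a plain list. The signal below is read off the column heights of the ASCII drawing above; the function name, the min_height threshold and the peak start coordinate are assumptions for illustration, and this is not how bwtool works internally.

```python
def local_maxima(signal, min_height=0):
    """Return indices of local maxima in `signal`.
    A flat top is reported once, at its midpoint; endpoints are ignored."""
    maxima = []
    i, n = 1, len(signal)
    while i < n - 1:
        if signal[i] > signal[i - 1]:
            # Walk to the end of a possible plateau at this height.
            j = i
            while j < n - 1 and signal[j + 1] == signal[i]:
                j += 1
            # It is a maximum only if the signal drops after the plateau.
            if j < n - 1 and signal[j + 1] < signal[i] and signal[i] >= min_height:
                maxima.append((i + j) // 2)
            i = j + 1
        else:
            i += 1
    return maxima


# Column heights of the three bumps in the drawing.
signal = [0, 0, 1, 2, 4, 3, 2, 2, 1, 0, 0, 0, 1, 2, 1, 0, 0, 0, 1, 2, 4, 3, 2, 1]

all_maxima = local_maxima(signal)
tall_maxima = local_maxima(signal, min_height=3)

# Convert indices to 1bp summit intervals for a hypothetical peak at chr1:1000.
peak_chrom, peak_start = "chr1", 1000
summit_intervals = [(peak_chrom, peak_start + i, peak_start + i + 1) for i in all_maxima]
```

The min_height threshold is one simple way to drop small bumps such as the middle one; whether such bumps are real or noise is a separate question that a threshold alone does not answer.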

Any other ideas?

chip-seq wig peak calling • 1.3k views
modified 2.2 years ago by Biostar • written 2.8 years ago by Sergio Martínez Cuesta (170)

Before discussing the technical side: what makes you think (beyond eyeball inspection) that these "subpeaks" are indeed meaningful and not just noise between significant events? What kind of data is this, ATAC-seq or ChIP-seq?

written 2.8 years ago by ATpoint (46k)

Good point, we do not know whether they are meaningful or noise. These are ChIP-seq datasets. I think it comes down to finding narrower peaks than the ones found so far with the current arguments. Perhaps tweaking the arguments of the peak callers would give me the finer detail I am looking for.

written 2.8 years ago by Sergio Martínez Cuesta (170)

What's the marker of interest? TFs bind in narrow regions, obviously, whilst other markers 'spread' across a large region, even up to megabases in some cases.

modified 2.8 years ago • written 2.8 years ago by Kevin Blighe (70k)
2.8 years ago by Alex Reynolds (31k), Seattle, WA USA, wrote:

You might take a look at Eric Rynes' hotspot2. It applies a rigorous statistical approach to calling enriched regions.

2.8 years ago by Rashedul Islam (390) wrote:

You can use the FindER peak caller. It has options for setting the minimum and maximum peak size. You can also set the distance below which two neighbouring peaks are merged.


Hi, Rashedul

I was wondering if you could share a link to the FindER paper? I just can't find it. Thanks!

written 21 months ago by ben.kunfang (30)