Question: How do I select top 100,000 non-overlapping peaks from MACS2 narrow peaks output?
3.7 years ago, YaGalbi (Biocomputing, MRC Harwell Institute, Oxford, UK) wrote:

Hello everyone,

I've just finished MACS2 narrow peak calling on ATAC-seq data. With a q-value cut-off of 0.05 I have around 200K peaks per sample. My literature review suggests that only the top 50,000 (some say 100,000) non-overlapping peaks are included in downstream analysis.

From the authors of the ATAC-seq protocol:

"Using the filtered peak set, peak summits were extended +/-250 bps. The top 50,000 non-overlapping 500bp summits, which we refer to as accessibility peaks were used for all downstream analysis."
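For reference, the summit extension described in that quote can be sketched in Python as below. This is only a minimal illustration, not the protocol authors' code. It assumes the standard 10-column narrowPeak layout, where column 2 is the 0-based peak start and column 10 is the summit offset from that start; the function name `summit_window` and the example line are made up for illustration.

```python
def summit_window(narrowpeak_line, half_width=250):
    """Return (chrom, start, end) for a window centred on the peak summit.

    Assumes a tab-separated narrowPeak line: column 2 is the 0-based
    peak start and column 10 is the summit offset relative to it.
    """
    fields = narrowpeak_line.rstrip("\n").split("\t")
    chrom = fields[0]
    peak_start = int(fields[1])
    summit_offset = int(fields[9])
    summit = peak_start + summit_offset
    # Clip at 0 so windows near a chromosome start stay valid
    # (such windows end up shorter than 2 * half_width).
    window_start = max(0, summit - half_width)
    window_end = summit + half_width
    return chrom, window_start, window_end


# Hypothetical example line, not real data.
example = "chr1\t1000\t1400\tpeak_1\t250\t.\t8.5\t30.2\t27.1\t120"
print(summit_window(example))
```

The window is not clipped at the chromosome end here, since that needs a chromosome sizes file.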

Conceptually I get the reasoning: there is no need to keep thousands of peaks that fall within the same 500bp window, so the overlaps are removed.

However, no authors state how they rank the top 100,000. Is it by -log10(qvalue) or is it by number of reads within the 500bp window? Does it make a difference which one I use?

It would be easier to use -log10(qvalue), as it is right there in the same narrowPeak file as the positions. I do realize I could be stricter with the q-value cut-off, but I don't think that alone will be enough to cut the set down to 100,000 peaks.
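To make the -log10(qvalue) option concrete, here is a rough Python sketch of one way to do it: sort the windows by score, then walk down the list and keep a window only if it does not overlap one already kept. Whether the published protocols rank this way is exactly what is unknown, so treat the choice of score as an assumption. The function name `top_non_overlapping` and the example windows are hypothetical, and intervals are treated as half-open (start inclusive, end exclusive), as in BED.

```python
import bisect
from collections import defaultdict


def top_non_overlapping(windows, n):
    """Greedily pick up to n non-overlapping windows, best score first.

    windows: iterable of (chrom, start, end, score) tuples, where score
    is e.g. the -log10(qvalue) column of the narrowPeak file.
    """
    ranked = sorted(windows, key=lambda w: w[3], reverse=True)
    kept = defaultdict(list)  # chrom -> sorted list of kept (start, end)
    selected = []
    for chrom, start, end, score in ranked:
        if len(selected) >= n:
            break
        intervals = kept[chrom]
        i = bisect.bisect_left(intervals, (start, end))
        # Kept intervals never overlap each other, so only the two
        # neighbours around the insertion point need checking.
        overlaps_prev = i > 0 and intervals[i - 1][1] > start
        overlaps_next = i < len(intervals) and intervals[i][0] < end
        if overlaps_prev or overlaps_next:
            continue
        intervals.insert(i, (start, end))
        selected.append((chrom, start, end, score))
    return selected


# Hypothetical windows, not real data.
windows = [
    ("chr1", 870, 1370, 27.1),
    ("chr1", 1000, 1500, 40.0),
    ("chr1", 2000, 2500, 5.0),
    ("chr2", 900, 1400, 12.3),
]
for window in top_non_overlapping(windows, 3):
    print(*window, sep="\t")
```

Ranking by read count in the window instead would only mean passing a different number as the fourth element of each tuple; the selection logic stays the same.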

Thanks for your input

Kenneth


Maybe you should merge the overlapping peaks into one large peak. Or you could ask the author of MACS2.

written 3.7 years ago by Ben

Yes Ben, I will be doing that. However, a ranking is still required to decide which peak to keep.

written 3.7 years ago by YaGalbi