Question: Duplicated ATAC-Seq peaks called by MACS2
0
gravatar for Gary
22 months ago by
Gary480
Taiwan/Taichung/China Medical University Hospital
Gary480 wrote:

Hi,

I use MACS2 to do peak calling for ATAC-Seq data using the command below. However I found that some duplicated peaks called by MASC2, e.g. ChineseBulbulHead_peak_23a, ChineseBulbulHead_peak_23b, & ChineseBulbulHead_peak_23c (the detail below). I don’t meet the same problem before. Could you help me how to deal with this issue? Many thanks.

Best,

Gary

The MACS2 command I sued

macs2 callpeak -t ChineseBulbulHead.bam --format BAMPE --gsize 1000000000 --keep-dup 1 --outdir ChineseBulbulHead --name ChineseBulbulHead --verbose 3 --qvalue 0.00001 --cutoff-analysis --call-summits

The part result of ChineseBulbulHead_peaks.narrowPeak

chr1    876210  876801  ChineseBulbulHead_peak_18a  137 .   7.66032 16.78457    13.76007    143
chr1    876210  876801  ChineseBulbulHead_peak_18b  191 .   9.25623 22.30006    19.12043    517
chr1    934111  934340  ChineseBulbulHead_peak_19   117 .   7.02196 14.69713    11.74247    82
chr1    987537  987972  ChineseBulbulHead_peak_20a  400 .   12.07194    43.64410    40.03254    107
chr1    987537  987972  ChineseBulbulHead_peak_20b  295 .   9.97071 33.01239    29.59148    329
chr1    1118363 1119105 ChineseBulbulHead_peak_21   863 .   14.59966    90.58256    86.34797    330
chr1    1183894 1184262 ChineseBulbulHead_peak_22   175 .   6.85205 20.72878    17.59063    115
chr1    1509040 1510513 ChineseBulbulHead_peak_23a  204 .   5.30448 23.65111    20.43788    98
chr1    1509040 1510513 ChineseBulbulHead_peak_23b  529 .   8.87295 56.72340    52.91084    546
chr1    1509040 1510513 ChineseBulbulHead_peak_23c  291 .   6.36538 32.60884    29.19571    733
chr1    1562249 1562717 ChineseBulbulHead_peak_24   340 .   11.72668    37.55609    34.05026    149
chr1    1918313 1918729 ChineseBulbulHead_peak_25   351 .   11.95111    38.69747    35.17159    279
chr1    2086004 2086223 ChineseBulbulHead_peak_26   186 .   7.50532 21.82297    18.65607    99
chr1    2807646 2808024 ChineseBulbulHead_peak_27a  53  .   4.78770 8.04636 5.38921 65
chr1    2807646 2808024 ChineseBulbulHead_peak_27b  169 .   8.61786 20.04613    16.92561    287
ADD COMMENTlink modified 22 months ago by geek_y11k • written 22 months ago by Gary480
3
gravatar for geek_y
22 months ago by
geek_y11k
Barcelona
geek_y11k wrote:

Its not a duplicated peak. Its because of --call-summits and the nature of ATAC data.

From the manual:

--call-summits

MACS will now reanalyze the shape of signal profile (p or q-score depending on cutoff setting) to deconvolve subpeaks within each peak called from general procedure. It's highly recommended to detect adjacent binding events. While used, the output subpeaks of a big peak region will have the same peak boundaries, and different scores and peak summit positions.

ADD COMMENTlink modified 22 months ago • written 22 months ago by geek_y11k

Hi, geek_y. Thank you so much. I rerun MACS2 without the --call-summits parameter and no subpeaks have the same peak boundaries. Do you mean it is because (1) I run the --call-summits parameter, (2) ATAC-Seq have many big peaks? However, I don't really understand the manual. Could you explain it more clearly for me? Thank you very very much.

ADD REPLYlink written 22 months ago by Gary480
3

Hi, I would suggest to read a bit more about ATAC-Seq data, why it has sub-peaks, why ATAC has properties of both TFs and histone data, nucleosome positioning etc.

ADD REPLYlink written 22 months ago by geek_y11k

Hi geek_y, Many thanks for your suggestion. It is very helpful. Using the ChineseBulbulHead_peak_18a & 18b as an example (the figure below), I find that there are two sub-peaks. The first track is a bed file using MACS2 command below without the --call-summit parameter.

macs2 callpeak -t ChineseBulbulHead.bam --format BAMPE --gsize 1000000000 --keep-dup 1 --outdir MACS2_ChineseBulbulHead --name ChineseBulbulHead --verbose 3 --qvalue 0.00001 --cutoff-analysis

The second track is a narrowPeak file using MACS2 command below with the --call-summit parameter.

macs2 callpeak -t ChineseBulbulHead.bam --format BAMPE --gsize 1000000000 --keep-dup 1 --outdir SummitsChineseBulbulHead --name ChineseBulbulHead --verbose 3 --qvalue 0.00001 --cutoff-analysis --call-summits

The third track is a bigwig file using deepTools2 command below with the –extendReads=20. It means that I artificially add 200bp for each read.

bamCoverage --bam ChineseBulbulHead.bam --outFileName ChineseBulbulHead.bigwig --outFileFormat=bigwig --scaleFactor=1 --binSize=1 --extendReads=200 --numberOfProcessors=6 --normalizeUsingRPKM –verbose

The fourth and fifth tracks are bam files. The coverage of the bam file is more like two peaks. The bigwig file is more like a single peak.

May I say that it is because my low number of reads (~ 26 million 87bp paired-end reads) causes two sub-peaks. If I sequence more reads, the two sub-peaks will merge into single one peak, such as my bigwig file? Many thanks.

enter image description here

ADD REPLYlink modified 21 months ago • written 21 months ago by Gary480
1

If you want to get one peak, don't use --call-summits.

ADD REPLYlink written 21 months ago by geek_y11k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1756 users visited in the last hour