Question: MACS2 option for ATAC seq?
13 months ago, star wrote:

I have some paired-end ATAC-seq samples and single-end DNase-seq samples, and I would like to call peaks on them with MACS2. For the DNase-seq samples I run MACS2 with --nomodel --shift -100 --extsize 200, but for the ATAC-seq samples I am not sure which is better: --nomodel --shift -100 --extsize 200, or only the -f BAMPE option of MACS2?

Also, I found that ENCODE part 2a uses BED files, but it lists both a BEDPE file and tagAlign (TAG) files:

-tagAlign file ${FINAL_TA_FILE}

-BEDPE file (with read pairs on each line) ${FINAL_BEDPE_FILE}

-Subsampled tagAlign file for CC analysis ${SUBSAMPLED_TA_FILE}

I am not sure which of these should be used as input to MACS2. Should I use the TAG file as input?


Please try the search function, as this question has been answered before: ATAC-seq peak calling with MACS

written 13 months ago by benformatics

13 months ago, benformatics (ETH Zurich) wrote:

The official Harvard guidelines as of this year suggest the following:

Our previous recommendation was to run MACS2 with -f BAMPE, which is similar to the default analysis mode of Genrich (inferring full fragments, rather than cut site intervals). Others have attempted to interpret cut site intervals with MACS2 by using the --shift and --extsize arguments, but these arguments are ignored in BAMPE mode. They do work in the default (BAM) mode, but then, with paired-end reads, most of the alignments are automatically discarded (half of the properly paired alignments and all of the unpaired alignments; secondary alignments are never considered). Is it worse to interpret full fragments that may be less informative biologically, or to disregard more than half of the sequence data? A complicated question. The correct answer is: use Genrich.

The main issue with --nomodel --shift -100 --extsize 200 -f BAM is that you lose a lot of reads (approx. 50%). The MACS2 manual states specifically for the -f BAM parameter:

"If the BAM file is generated for paired-end data, MACS will only keep the left mate(5' end) tag."

So you lose all your 3' information.

If you are interested, you could read the old Harvard guidelines, which do cover the optimal parameters for using MACS2.
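The two MACS2 modes discussed in this answer can be sketched as follows. This is only an illustration, not a recommended pipeline: the file name, the output prefix, and the genome size (-g hs, human) are placeholder assumptions, and the small run helper is hypothetical. By default it only prints the commands; set RUN=1 to execute them with macs2 on your PATH.

```shell
# Sketch only: BAM, NAME and -g hs (human) are placeholder assumptions.
BAM="atac_sample.bam"
NAME="atac_sample"

# Hypothetical helper: print the command by default, run it only when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Fragment mode: -f BAMPE uses whole fragments inferred from read pairs.
# --shift and --extsize are ignored in this mode, so they are left out.
call_fragment_mode() {
  run macs2 callpeak -t "$BAM" -f BAMPE -g hs -n "${NAME}_bampe"
}

# Cut-site mode: -f BAM honours --shift/--extsize, but with paired-end
# input only one mate per pair is kept (the read loss described above).
call_cutsite_mode() {
  run macs2 callpeak -t "$BAM" -f BAM -g hs -n "${NAME}_cutsite" \
    --nomodel --shift -100 --extsize 200
}

call_fragment_mode
call_cutsite_mode
```

Comparing the peak sets from both calls on the same sample is one way to see how much the choice of mode matters for your data.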


Thanks for your reply.

So, as I understand it, if I use a BAM file it is better to use -f BAMPE, but if I convert the BAM file to a BED file I can use either -f BEDPE or --nomodel --shift -100 --extsize 200 -f BED. Am I right?

written 13 months ago by star

I think so. When I use --nomodel --shift -100 --extsize 200 -f BED, I get 1.5 times more peaks than with --nomodel --shift -100 --extsize 200 -f BAM.

modified 5 months ago • written 5 months ago by p4alindromic
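The BED route from this exchange could look roughly like the sketch below. It assumes bedtools is installed and uses its bamtobed subcommand to write one BED line per aligned read. As far as I understand, MACS2 then treats every line as an independent read, so both mates get shifted and extended, which would fit the higher peak count reported above. File names and -g hs are placeholders, and print_plan is a hypothetical helper that only prints the two commands.

```shell
# Sketch only: file names and -g hs (human) are placeholder assumptions.
BAM="atac_sample.bam"
BED="atac_sample.reads.bed"
NAME="atac_sample"

# Hypothetical helper that prints the two steps instead of running them.
print_plan() {
  # Step 1: one BED line per aligned read, so both mates of a pair are kept.
  echo "bedtools bamtobed -i $BAM > $BED"
  # Step 2: cut-site style peak calling on the per-read BED file.
  echo "macs2 callpeak -t $BED -f BED -g hs -n ${NAME}_bed --nomodel --shift -100 --extsize 200"
}

print_plan
# To execute instead of print (needs bedtools and macs2 on PATH, and
# file names without spaces): print_plan | sh
```

Note that this covers -f BED only. MACS2's -f BEDPE expects a three-column fragment file (chromosome, start, end), which is not the layout that bedtools writes with its -bedpe option.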

Hi, my 2p: I just started using Genrich on ATAC-seq with replicates, and my first impression is actually very good compared to MACS2 (I may post a better explanation at some point...).

written 13 months ago by dariober

What exactly is the issue with MACS2? Can you post the command you use for it?

written 12 months ago by ATpoint