Question

MACS2 --nolambda fold enrichment cutoff

0

Entering edit mode

4.2 years ago

emre.cto • 0

In the output file of MACS2, the fold enrichment normally refers to enrichment of summit to local lambda model. I am using --nolambda option for ATAC-seq analysis (recommended as there is no input file like Chip-seq). Is the output file now showing enrichment with respect to global lambda? If so, does anyone have a suggestion for a reasonable cutoff?

macs2 ATAC-seq • 3.1k views

ADD COMMENT • link updated 4.2 years ago by ATpoint 82k • written 4.2 years ago by emre.cto • 0

score 2 · Answer 1 · 2020-02-27

2

Entering edit mode

4.2 years ago

ATpoint 82k

The whole point of the local lambda is to estimate the local background in the very situation that no input file is available. --nolambda turns this off and is imho a terrible choice for any peak calling situation as it then assumes uniform background across the genome. You will get plenty of false-positives with low enrichments. Still, due to GC bias, PCR bias, mappability etc. the true background is everything but uniform. I strongly recommend not to use this option. Please read the original MACS paper for more details on how lambda works in the absence of input controls. Can you link where exactly you got the recommendation from to use --nolambda?

ADD COMMENT • link 4.2 years ago by ATpoint 82k

0

Entering edit mode

After trying both with --nolambda and default, I agree with you that there are many false positive looking peaks. I forgot where I saw the suggestion first but many papers use this option. Including the paper for HMMRATAC program written by the same group who developed MACS (they compare with MACS2 of course):

"MACS2 was run with the local lambda option turned off (option –nolambda) and was run with q-value cutoffs (option -q) set to 0.5, 0.1, 0.05 (default), 0.005, 0.0005 and 0.00005. It was also run with a P-value cutoff (option -p) set to 0.6 and 0.3." https://academic.oup.com/nar/article/47/16/e91/5519166

Some other examples: https://www.nature.com/articles/s41587-019-0332-7 https://www.sciencedirect.com/science/article/pii/S2211124719311519?via%3Dihub

ADD REPLY • link 4.2 years ago by emre.cto • 0

0

Entering edit mode

The first link tests different parameters for benchmarking, the other two are single-cell ATAC-seq where you intrinsically have super-low coverage. This does not apply to standard ATAC-seq. I suggest you leave it on. Alternatively, if your samples have a good sequencing depth like 20mio or so reads I found the Genrich peak caller quite useful and producing good results.

ADD REPLY • link 4.2 years ago by ATpoint 82k

0

Entering edit mode

Thanks, but I didn't understand your argument about single-cell ATAC-seq. The coverage per cell or per cluster can be low but peak calling is done using all reads which treats the sample like bulk. Then the reads are counted per peak per cell. So there is no reason why --nolambda shouldn't produce false positive results in single-cell ATAC-seq. But perhaps this is seen as a trade-off for capturing potentially true but low peaks?

ADD REPLY • link 4.2 years ago by emre.cto • 0

0

Entering edit mode

I do not have hands-on experience with scATAC, but I guess if certain peaks are exclusive for a small subpopulation then coverage is still super low, probably coverage is in general lower, and probably (in the case of subpopulations) then one needs to turn the background model off to even have a chance to get peaks in these populations, but this is only guessing. As said for standard ATAC-seq, I'd leave it on.

ADD REPLY • link 4.2 years ago by ATpoint 82k