I am trying to call peaks on my ATAC data using MACS2 and realized that there is a huge inconsistency within the number of reads in my bam file and the total tags reported by MACS2.
I have seen previous posts that mention using "--keep-dup 'all'" option but I have deduplicated my bam using picard before macs. Below is the command am using and the outputs for macs and samtools flagstat.
macs2 callpeak -t PrimaryKeratinocytes.suf.bam -g hs -n test --nomodel --shift -51 --extsize 102 --q 0.05 --keep-dup 'all'
macs2
INFO @ Fri, 17 Aug 2018 14:53:58: #1 tag size is determined as 51 bps
INFO @ Fri, 17 Aug 2018 14:53:58: #1 tag size = 51
INFO @ Fri, 17 Aug 2018 14:53:58: #1 total tags in treatment: 37171037
flagstat
52164544 + 0 in total (QC-passed reads + QC-failed reads)
14993507 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
52164544 + 0 mapped (100.00% : N/A)
Why is macs using only 37 million reads? How can I include all the reads for peak calling?
Thank you
Thank you for the editing @genomax