I am using DiffBind to call differential peaks on an ATAC-seq dataset comparing condition A vs B, each with 3 replicates.
The total number of peaks after taking the union is 106700, and of these 33197 (~31%) come out as significant even with a stringent FDR cutoff of 0.01.
Is this high proportion of significant peaks expected? I understand that it depends on the data, but the number still seems very high. Should any additional steps be added to the protocol below to increase the specificity?
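To put the numbers above in context, here is a minimal Python sketch of the arithmetic. It uses only the counts quoted in this post. The variable names are illustrative, and the "expected false discoveries" line assumes the FDR is actually controlled at the stated level, which depends on the statistical model fitting the data.

```python
# Counts quoted in the post
total_peaks = 106700        # union (consensus) peak set
significant_peaks = 33197   # peaks passing the FDR cutoff
fdr_cutoff = 0.01

# Share of the consensus peak set reported as differential
fraction_significant = significant_peaks / total_peaks

# If the FDR is controlled at the cutoff, about this many of the
# significant peaks are expected to be false positives (assumption:
# the test's model assumptions hold for this dataset)
expected_false_discoveries = fdr_cutoff * significant_peaks

print(f"Fraction significant: {fraction_significant:.1%}")
print(f"Expected false discoveries: {expected_false_discoveries:.0f}")
```

In other words, the FDR cutoff only limits the share of false calls among the significant peaks. It does not by itself limit how many peaks can be called significant.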
The protocol I followed was:
Peaks were called for each replicate individually with macs2 using --format BAMPE, so that the fragment size information in the BAM is used for extension (reads are paired-end). The p-value cutoff here was kept lenient at 0.01 to allow more peaks early on.
DESeq2 was then used within DiffBind for normalization and for finding the differential peaks.
The summits option was not used in DiffBind, as peak size is not a particular concern.