I've installed and run the current ENCODE-DCC ATAC-seq pipeline from GitHub. I have 2 replicates of ATAC-seq data. When I run the pipeline, regardless of which (blacklist-filtered) peak file I look at (rep1, rep2, or pooled), about 6% of the peak loci are repeated (but with different scores). Most are repeated two or three times, but the worst case has 7 entries!
Here's an example:
chr1    629086  630068  Peak_1  1000    .   5.22734 7335.73975  7327.66357  727
chr1    629086  630068  Peak_53 1000    .   1.69177 307.40005   301.90186   79
chr1    629086  630068  Peak_6  1000    .   3.27104 2558.00244  2551.49731  291
Does anyone have any idea what's going on here, and what I should do with these "extra" peaks?
We never figured this out. In the end we just merged the peaks, since we don't use the scores after an initial filter anyway.
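For anyone wanting to do something similar, here is a minimal sketch of one way to collapse repeated loci. It is not necessarily what was done above: it assumes the file follows the standard narrowPeak column order (column 7 is signalValue, column 10 is the summit offset from the start) and it keeps, for each (chrom, start, end), the row with the highest signalValue. The function name `collapse_duplicate_loci` is made up for illustration. Note that the three example rows differ in the last column, so keeping a single row also discards the other summit positions.

```python
def collapse_duplicate_loci(lines):
    """Keep one narrowPeak row per (chrom, start, end).

    Assumption: among rows sharing a locus, the one with the highest
    signalValue (column 7) is the one worth keeping.
    """
    best = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()  # handles tabs or runs of spaces
        key = (fields[0], int(fields[1]), int(fields[2]))
        signal = float(fields[6])
        if key not in best or signal > float(best[key][6]):
            best[key] = fields
    # dicts preserve insertion order, so loci come out in first-seen order
    return ["\t".join(fields) for fields in best.values()]


# The three rows from the question
example = [
    "chr1    629086  630068  Peak_1  1000    .   5.22734 7335.73975  7327.66357  727",
    "chr1    629086  630068  Peak_53 1000    .   1.69177 307.40005   301.90186   79",
    "chr1    629086  630068  Peak_6  1000    .   3.27104 2558.00244  2551.49731  291",
]

collapsed = collapse_duplicate_loci(example)
for row in collapsed:
    print(row)  # only the Peak_1 row remains
```

For a real file, pass an open file handle instead of the list, e.g. `collapse_duplicate_loci(open("peaks.narrowPeak"))` (the file name here is a placeholder). If the individual summits matter for your downstream analysis, keep all rows and deduplicate only where you need unique intervals.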
Open an issue on the GitHub page of the ATAC-seq pipeline.