Question

Shifting or not when counting reads of ATAC-seq Peaks

5.4 years ago

Richard ▴ 30

My goal is to find regions with differential accessibility between groups. The common strategy I have read about is: 1) map reads and call peaks for each individual sample; 2) filter the peaks against the blacklist and take the union of all peaks across samples; 3) count the reads mapped to each peak and feed the count table to DESeq2, edgeR, or a similar tool.

For the first step I used MACS with the parameters --nomodel --shift -100 --extsize 200 to call open chromatin (breakpoint) peaks for my paired-end ATAC-seq data. (There are quite a few other ways of calling peaks; for example, some people select only fragments with a short insert size, i.e., <=100 or 150 bp, and call peaks directly without shifting.) My understanding of the shifting and extending above is that the modified read ends up roughly centered on the break point.
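
To make that arithmetic concrete, here is a minimal sketch of how I understand the shift and extension to act on a read's 5' end. The function name shifted_interval and the coordinate convention (the 5' end treated as a single point on the reference) are my own assumptions for illustration; this is not MACS code.

```python
def shifted_interval(five_prime, strand, shift=-100, extsize=200):
    """Return (start, end) of the pseudo-fragment built from a read's 5' end.

    A negative shift moves the 5' end backwards (3'->5' direction) along the
    read's own strand; the fragment is then extended by extsize in the
    5'->3' direction. Coordinates are reference positions.
    """
    if strand == "+":
        start = five_prime + shift      # move left for a negative shift
        return start, start + extsize   # extend to the right
    if strand == "-":
        end = five_prime - shift        # move right for a negative shift
        return end - extsize, end       # extend to the left
    raise ValueError("strand must be '+' or '-'")


# With shift = -100 and extsize = 200 the fragment is centered on the 5' end
# regardless of strand.
print(shifted_interval(1000, "+"))
print(shifted_interval(1000, "-"))
```

Under these assumptions, both calls give a 200 bp window with the 5' end in the middle, which is why I expect the break point to sit around the center of the modified read.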

For the counting step, I have read that different tools are used (bedtools, featureCounts, deepTools, HTSeq). The step itself is not difficult to perform, so my (naive) question is whether one should shift and extend the reads as above before counting. If not, reads whose 5' ends lie near a peak, but whose aligned bases do not overlap it, are not counted, even though MACS may have used them as support for that peak.
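
A toy example of the discrepancy I am worried about, as a minimal sketch. The peak, the reads, and the helper names are all made up for illustration, and the overlap rule (any shared base counts) is an assumption; real counting tools have their own overlap options.

```python
def shifted_interval(five_prime, strand, shift=-100, extsize=200):
    # Same convention as the peak-calling parameters: shift the 5' end,
    # then extend by extsize in the 5'->3' direction.
    if strand == "+":
        start = five_prime + shift
        return start, start + extsize
    end = five_prime - shift
    return end - extsize, end


def overlaps(a_start, a_end, b_start, b_end):
    # Half-open intervals overlap if each one starts before the other ends.
    return a_start < b_end and b_start < a_end


def count_reads(reads, peak, shift_reads):
    """Count reads overlapping a peak, with or without shifting first."""
    n = 0
    for start, end, strand in reads:
        if shift_reads:
            five_prime = start if strand == "+" else end
            start, end = shifted_interval(five_prime, strand)
        if overlaps(start, end, *peak):
            n += 1
    return n


peak = (1000, 1200)            # hypothetical peak, half-open coordinates
reads = [                      # hypothetical aligned reads: (start, end, strand)
    (1050, 1100, "+"),         # inside the peak
    (940, 990, "-"),           # 5' end 10 bp left of the peak
    (1210, 1260, "+"),         # 5' end 10 bp right of the peak
    (500, 550, "+"),           # far away
]

raw_count = count_reads(reads, peak, shift_reads=False)
shifted_count = count_reads(reads, peak, shift_reads=True)
print(raw_count, shifted_count)
```

In this toy case the two flanking reads are counted only after shifting, which is exactly the kind of read I suspect contributes to the peak call but is missed by a plain overlap count.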

Thanks for any reply.

ChIP-Seq next-gen • 2.1k views
