Determine peak summits after IDR analysis
5.6 years ago
Lucy ▴ 140

Hi,

I have performed ATAC-seq on three different cell subsets and generated an optimal peak list for each using IDR analysis. I then ran bedtools merge to generate a global peak list for all of the subsets combined. However, merging loses the summit information: some merged peaks contain multiple summits, but bedtools merge collapses the overlapping peaks into a single interval and discards their summits. Does anyone have any ideas about how to overcome this? I would like to obtain the peak summits in order to perform motif analysis.

Many thanks,

Lucy

macs2 idr peak summits • 2.4k views

MACS2 was used for the initial peak calling.

5.6 years ago

Use awk to grab the summit column from a peak file and put it into the ID field (fourth column):

$ awk '{ print ... }' A > A.custom

Repeat for inputs B and C.

Map the merged custom intervals to the customized intervals, to get the IDs contained within each merged interval:

$ bedmap --echo --echo-map-id --delim '\t' <(bedops --merge A.custom B.custom C.custom) <(bedops --everything A.custom B.custom C.custom) > answer.bed

Note: bedops and bedmap expect inputs sorted with sort-bed, so sort each custom file first if it is not already sorted.

Thank you. Could you please explain the first part in more detail? I would take column 10 from my .narrowPeak file, which is the relative summit position, and place this into the ID column of which file?


If A.bed is your narrowPeak file:

$ awk -vOFS="\t" -vFS="\t" '{ print $1, $2, $3, $4"%%"$10 }' A.bed > A.custom.bed

The ID field (fourth column) of A.custom.bed contains the original peak ID concatenated with the summit offset from column 10 (0-based, relative to the peak start), using %% as a delimiter.

Repeat for peak files B.bed and C.bed.
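
If absolute genomic coordinates are easier to work with downstream, one possible variant (my assumption, not a required part of the approach) is to add the peak start to the offset before storing it, since narrowPeak column 10 is counted from column 2. A minimal sketch on a made-up two-record narrowPeak file; the peak names and values are hypothetical:

```shell
# Hypothetical example input: two tab-separated narrowPeak records (10 columns).
tmpdir=$(mktemp -d)
printf 'chr1\t100\t300\tpeakA1\t500\t.\t8.1\t20.3\t18.0\t50\n'     > "$tmpdir/A.bed"
printf 'chr1\t1000\t1400\tpeakA2\t700\t.\t9.4\t31.2\t28.5\t120\n' >> "$tmpdir/A.bed"

# Store the absolute summit coordinate (start + offset) in the ID field,
# instead of the raw column-10 offset.
awk -v OFS="\t" -v FS="\t" '{ print $1, $2, $3, $4"%%"($2 + $10) }' \
    "$tmpdir/A.bed" > "$tmpdir/A.custom.bed"

cat "$tmpdir/A.custom.bed"
```

With this variant, the value after %% no longer depends on the start of the original peak, which is not carried through to the merged output.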

After the bedmap --echo-map-id step, the result contains the merged intervals of peaks from the three peak files, plus a fourth column listing the modified IDs (and therefore the summits) of every original peak that overlaps each merged interval.

Those IDs can be parsed out to get the summit positions that fall within the merged interval.
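
As a rough sketch of that parsing step, the awk below splits the fourth column on ';' (the default separator bedmap uses between mapped IDs) and then on %%, writing one line per summit. The sample line and the output name summits.tsv are made up for illustration:

```shell
# Hypothetical example of one line of answer.bed: merged interval + mapped IDs.
tmpdir=$(mktemp -d)
printf 'chr1\t100\t450\t%s\n' 'peakA1%%50;peakB3%%80;peakC2%%210' > "$tmpdir/answer.bed"

# Emit one row per original peak: merged interval, peak ID, stored summit value.
awk -v OFS="\t" -v FS="\t" '{
    n = split($4, ids, ";")            # one entry per overlapping original peak
    for (i = 1; i <= n; i++) {
        split(ids[i], parts, "%%")     # parts[1] = peak ID, parts[2] = summit value
        print $1, $2, $3, parts[1], parts[2]
    }
}' "$tmpdir/answer.bed" > "$tmpdir/summits.tsv"

cat "$tmpdir/summits.tsv"
```

The last column is whatever summit value was stored in the ID at the awk step, so it should be interpreted accordingly.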
