Overlapping And Unique Peaks In Chipseq
Entering edit mode
11.1 years ago
Diana ▴ 910

Hi all,

I am currently analyzing chip-seq data for two TFs. I want to find out overlapping as well as unique peaks between these two TFs. Im using the package ChIPpeakAnno. The thing is I have this code in R that finds the overlapping peaks but I can't seem to get the unique ones out of it:

  t1 = findOverlappingPeaks(peaks1, peaks2, maxgap=2000,NameOfPeaks1="TF1", NameOfPeaks2="TF2", select="all", annotate=1, multiple=TRUE)

overlapping_peaks=overlap(TF1, TF2)
TF1-TF2= TF1[ !(overlapping_peaks),]

I can't seem to make this ! work. I want to exclude all the peaks which overlap and get all the ones unique in TF1.

Any suggestions on how to make it work. Should I use match somehow?

The files look like this:

TF1 and TF2 look like this:

chr              start              end
1 chr1         85571823       85575093
2 chr1         78834350       78837116
3 chr1          2872978        2875530
4 chr1         50641235       50643585
5 chr1         37698343       37699420
6 chr1         37698343       37699420

overlapping_peaks file:

 chr        start       end
1 chr1  79092406  79094792
2 chr1 198383685 198385392
3 chr1 200988993 200993982
4 chr1  63836348  63842294

Thanks a lot!!!

r chipseq • 9.4k views
Entering edit mode
11.1 years ago
Matt LaFave ▴ 310

Hi Diana -

I'm sure there is a way to get this done in R, but the first thing that springs to my mind is to massage the peak data into BED files and use bedtools. I believe this would give you the peaks in TF1 that don't overlap TF2:

bedtools intersect -v -a TF1.bed -b TF2.bed
Entering edit mode
11.1 years ago

I have a suggestion in the same vein as Matt LeFave; use the DiffBind R package which has good functionality for not only finding common and unique peaks, but also convenient plotting and statistical testing for differential binding.

Entering edit mode
11.1 years ago

I am currently analyzing chip-seq data for two TFs. I want to find out overlapping as well as unique peaks between these two TFs.

If you can put your data into UCSC BED format, the BEDOPS bedmap and BEDOPS bedops applications are useful for answering these sorts of questions.

Here is a graphical overview of bedmap:

bedmap overview

In your case, your TF calls are your reference regions. Your ChIP-seq peaks are map elements. The operation to use is --echo-map, which will report all overlapping peaks for a given set of transcription factor calls.

To answer your two questions, we will apply bedmap twice, once for each TF call set, and then do set operations with bedops between the results.

To start, let's say your TF calls are in sorted BED files called TF1.bed and TF2.bed. Your ChIP-seq peaks are in a sorted file called ChIP.bed. You can run the following commands to get per-TF-set overlapping peaks in files called allPeaks.TF1.bed and allPeaks.TF2.bed:

$ bedmap --echo-map --multidelim '\n' TF1.bed ChIP.bed \
    | sort-bed - \
    > allPeaks.TF1.bed

$ bedmap --echo-map --multidelim '\n' TF2.bed ChIP.bed \
    | sort-bed - \
    > allPeaks.TF2.bed

The results in allPeaks.TF1.bed and allPeaks.TF2.bed are organized as follows:

[ peak-1-1 ]  \n  [ peak-1-2 ] \n ...
[ peak-2-1 ]  \n  [ peak-2-2 ] \n ...
[ peak-N-1 ]  \n  [ peak-N-2 ] \n ...

In other words, for the first transcription factor in TF1.bed or TF2.bed, there is a list of ChIP peaks that overlap it separated by newlines, and so on.

Note that this output is a sorted BED file — we can pipe this result straight into bedops to do set operations without any extra work!

We can now answer your second question: peaks unique to either factor.

To get the list of peaks unique to TF1, between TF1 and TF2, we apply the --not-element-of set operation on result files for TF1 and TF2:

$ bedops --not-element-of -1 allPeaks.TF1.bed allPeaks.TF2.bed > uniquePeaks.TF1.bed

We repeat this to answer the question in the other direction, to get a list of peaks unique to TF2, between TF1 and TF2:

$ bedops --not-element-of -1 allPeaks.TF2.bed allPeaks.TF1.bed > uniquePeaks.TF2.bed

The bedops application has lots of set operations; here is a diagram of a selection of them:

bedops overview

Feel free to check out the BEDOPS site and post any questions here or on our user forum. Hope this helps!

Entering edit mode

I digged bedpost little bit. Its a great program in terms of finding overlaps of my chip seq peaks. However, I couldnt find how to implement peak density values to the bedops. Those values are also as important as the locations of the peaks. Could you help me about that ?

Entering edit mode
11.1 years ago

I have not used the ChIPpeakAnno package, but I can tell you that this can be done quite easily on Galaxy. There are intersect, subtract, and merge tools under the Operate on Genomic Intervals tab.


Login before adding your answer.

Traffic: 2839 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6